4 TeraGrid Sites Have Focal Points:
SDSC – The Data Place
Large-scale and high-performance data analysis/handling
Every cluster node is directly attached to the SAN
NCSA – The Compute Place
Large-scale, large-flops computation
Argonne – The Viz Place
Scalable viz walls
Caltech – The Applications Place
Data and flops for applications – especially some of the GriPhyN apps
Specific machine configurations reflect these focal points
The document discusses grids and their potential use for data mining applications in Earth science. Some key points:
- Grids can connect distributed computing and data resources to enable large-scale applications and collaboration.
- The Grid Miner application was developed to mine satellite data on NASA's Information Power Grid as a demonstration.
- Grids could help couple satellite data archives to computational resources, allowing users to process large datasets.
- For this to be realized, data archives need to be connected to grids and tools developed to enable scientists to access and analyze data.
This case study report presentation provides an overview of grid computing. It defines grid computing and discusses its key building blocks including networks, computational nodes, and common infrastructure standards. The presentation also examines grid computing models like distributed super-computing and data-intensive computing. Challenges of grid computing and examples of applications in fields like life sciences, engineering, and physical sciences are outlined.
Positioning University of California Information Technology for the Future – Larry Smarr
05.02.15
Invited Talk
The Vice Chancellor of Research and Chief Information Officer Summit
“Information Technology Enabling Research at the University of California”
Title: Positioning University of California Information Technology for the Future: State, National, and International IT Infrastructure Trends and Directions
Oakland, CA
The Open Science Data Cloud: Empowering the Long Tail of Science – Robert Grossman
The Open Cloud Consortium operates the Open Science Data Cloud, a not-for-profit cloud computing infrastructure that supports scientific research. The Open Cloud Consortium manages cloud computing testbeds and resources donated by universities, companies, government agencies, and international partners. Its goal is to democratize access to data and computing power for scientific discovery through its Open Science Data Cloud.
Impact of Grid Computing on Network Operators and HW Vendors – Tal Lavian Ph.D.
The network is a prime resource for large-scale distributed systems.
An integrated software system provides the "glue".
A dynamic optical network serves as a fundamental Grid service in data-intensive Grid applications: it can be scheduled, managed, and coordinated to support collaborative operations.
Large Scale On-Demand Image Processing For Disaster Relief – Robert Grossman
This is a status update (as of Feb 22, 2010) of a new Open Cloud Consortium project that will provide on-demand, large scale image processing to assist with disaster relief efforts.
This document provides an agenda for a presentation on harnessing the power of big data with Oracle. The agenda includes introductions to big data and market trends, defining big data, an overview of the Oracle Big Data Appliance, Oracle's integrated software solution, and a demonstration. The presentation aims to show how Oracle can help organizations access and analyze large, diverse datasets to drive innovation.
Calit2-a Persistent UCSD/UCI Framework for Collaboration – Larry Smarr
05.02.16
Invited Talk
Sun Microsystems Global Education and Research
Conference 2005
Title: Calit2-a Persistent UCSD/UCI Framework for Collaboration
San Francisco, CA
"A cloud for comparing our genes to images of the brain." In 2007, the late database pioneer Jim Gray announced the emergence of a fourth scientific paradigm: digital scientific research driven entirely by the exploration of massive data. That vision is now everyday reality in scientific research laboratories, and it goes well beyond what is commonly called "BIG DATA". In 2010, Microsoft Research and Inria started a project called Azure-Brain (or A-Brain) whose originality is twofold: building, on top of Windows Azure, a new platform for accessing massive data from scientific applications, and confronting that platform with the reality of scientific research. In this session we first situate the research challenges of managing massive data in the cloud, then present the "TomusBlobs" cloud storage platform optimized for Azure. Finally, we present the A-Brain project and the results we obtained. Neuroimaging contributes to the diagnosis of certain diseases of the nervous system, but our brains all turn out to be slightly different from one another, and this variability complicates medical interpretation. Hence the idea of correlating MRI images of the brain with each patient's genetic makeup, in order to better delineate the brain regions of symptomatic interest. The high-resolution MRI images for this project are produced by the NeuroSpin platform at CEA (Saclay). The problem for researchers is the mass of information to process: an individual's genetic profile comprises about one million data points, on top of which come equally colossal volumes of 3D voxels describing the images. A data deluge: petabytes of data and potentially years of computation. This is where the cloud comes in, with a platform optimized on Azure for running massively parallel applications over massive data. As its lead, Gabriel Antoniu, explains, this Rennes-based research team has developed "efficient storage mechanisms to improve access to this massive data and optimize its processing. Our developments meet the application needs of our colleagues in Saclay."
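As a rough sense of the scale described above, a back-of-the-envelope Python sketch; only the ~1 million genetic values per subject comes from the abstract, while cohort size, voxel count, and value width are assumptions for illustration:

```python
# Only the ~1 million genetic values per subject comes from the abstract;
# cohort size, voxel count, and value width are assumptions.
SUBJECTS = 2000                # assumption: cohort size
GENETIC_VALUES = 1_000_000     # ~1 million genetic data points per subject
VOXELS = 1_500_000             # assumption: voxels per high-resolution MRI volume
BYTES_PER_VALUE = 4            # assumption: 32-bit values

genetic_tb = SUBJECTS * GENETIC_VALUES * BYTES_PER_VALUE / 1e12
imaging_tb = SUBJECTS * VOXELS * BYTES_PER_VALUE / 1e12
pairs = GENETIC_VALUES * VOXELS       # gene-voxel pairs tested per subject

print(f"genetic data: {genetic_tb:.1f} TB, imaging data: {imaging_tb:.1f} TB")
print(f"gene-voxel correlation pairs per subject: {pairs:.1e}")
# Repeated statistical re-runs and derived results push the totals toward
# the petabytes mentioned in the abstract.
```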
Grid computing involves linking together distributed computer resources from multiple administrative domains to achieve a common goal. Resources in a grid are heterogeneous and geographically dispersed. A grid differs from a cluster in that it provides a consistent, dependable, and transparent collection of computing resources across wide distances. Grid infrastructure must respect local autonomy, handle heterogeneous hardware, and be resilient and dynamic.
Challenges and Issues of Next Cloud Computing Platforms – Frederic Desprez
Cloud computing has now crossed the frontier from research to industry. It is used every day, whether to exchange emails or make reservations on websites. However, much research remains to be done to improve the performance and functionality of these platforms of tomorrow. In this talk, I will give an overview of some of this theoretical and applied research done at INRIA, particularly around cloud distribution, energy monitoring and management, massive data processing and exchange, and resource management.
Big Data, Big Computing, AI, and Environmental Science – Ian Foster
I presented to the Environmental Data Science group at UChicago, with the goal of getting them excited about the opportunities inherent in big data, big computing, and AI--and to think about how to collaborate with Argonne in those areas. We had a great and long conversation about Takuya Kurihana's work on unsupervised learning for cloud classification. I also mentioned our work making NASA and CMIP data accessible on AI supercomputers.
Lessons Learned from a Year's Worth of Benchmarking Large Data Clouds – Robert Grossman
The document summarizes Sector, an open-source large data cloud computing platform, and compares it to Hadoop. Sector uses a file-based storage system instead of Hadoop's block-based HDFS, and features a more flexible UDF programming model compared to MapReduce. Benchmark results show Sector outperforming Hadoop on the Terasort and MalStone benchmarks, with speedups of up to 19x, due to its dataflow balancing, UDP-based transport, and other architectural advantages over Hadoop for data-intensive computing at scale. Lessons learned include the importance of data locality, load balancing, and fault tolerance in large-scale systems.
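To make the programming-model contrast concrete, here is a plain-Python sketch of the two styles; it is illustrative only and uses neither Sector's nor Hadoop's actual APIs:

```python
# Plain-Python illustration of the two programming models; not Sector's or
# Hadoop's actual APIs.
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """MapReduce: a fixed two-phase pipeline over (key, value) pairs."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}

def apply_udf(segments, udf):
    """UDF model: an arbitrary function runs over each file segment and may
    emit results of any shape (no mandatory key/value structure)."""
    return [udf(seg) for seg in segments]

docs = ["big data cloud", "data cloud cloud"]

# Word count forced into MapReduce's two-phase shape:
print(map_reduce(docs,
                 mapper=lambda d: [(w, 1) for w in d.split()],
                 reducer=sum))

# The same data under a free-form UDF: one pass, arbitrary output:
print(apply_udf(docs, lambda seg: (len(seg.split()), seg.upper())))
```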
This document provides an overview of Bionimbus and the Open Cloud Consortium. Bionimbus is a cloud platform for genomic data that provides services for data ingestion, storage, analysis, and sharing. It uses virtual machines and elastic cloud resources to enable large-scale genomic analyses. The Open Cloud Consortium is a non-profit focused on developing open cloud standards and operating shared cloud infrastructures for research. Its goals include bridging public and private clouds and operating large testbeds for cloud evaluation. The Consortium includes universities, companies, and government agencies working on cloud specifications, benchmarks, and shared testbed resources.
Computational science and engineering (CSE) utilizes high performance computing, large-scale simulations, and scientific applications to enable data-driven discovery. The speaker discusses CSE initiatives at UC Berkeley and Lawrence Berkeley National Lab focused on areas like health, freshwater, food security, ecosystems, and urban metabolism using exascale computing and big data analytics.
This document summarizes a seminar presentation on big data analytics. It reviews 25 research papers published between 2011-2014 on issues related to big data analysis, real-time big data analysis using Hadoop in cloud computing, and classification of big data using tools and frameworks. The review process involved a 5-stage analysis of the papers. Key issues identified include big data analysis, real-time analysis using Hadoop in clouds, and classification using tools like Hadoop, MapReduce, HDFS. Promising solutions discussed are MapReduce Agent Mobility framework, PuntStore with pLSM index, IOT-StatisticDB statistical database mechanism, and visual clustering analysis.
Distributed Cyberinfrastructure to Support Big Data Machine Learning – Larry Smarr
Panel on the Future of Machine Learning
California Institute for Telecommunications and Information Technology
University of California, Irvine
May 24, 2018
Science and Cyberinfrastructure in the Data-Dominated Era – Larry Smarr
10.02.22
Invited talk
Symposium #1610, How Computational Science Is Tackling the Grand Challenges Facing Science and Society
Title: Science and Cyberinfrastructure in the Data-Dominated Era
San Diego, CA
This document summarizes a lecture on trends and paradigm shifts in computational fluid dynamics (CFD).
Market forces are driving hardware away from scientific computing toward lower-precision, highly parallel architectures. This will impact how CFD is performed in the future. As computing power increases, databases of detailed simulations may allow machine learning to provide efficient reduced-order models to replace human modeling for some problems. Examples discussed include reactive flows and non-equilibrium turbulence modeling using coherent structure dynamics equations. The talk also addressed how hardware trends may affect CFD and discussed using machine learning and genetic algorithms to optimize reduced-order models.
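A minimal sketch of the surrogate-modeling idea mentioned above, with synthetic data standing in for a database of detailed simulations; the input/output choice and polynomial form are assumptions for illustration:

```python
# Synthetic data stands in for a database of detailed simulations; the
# input/output choice and polynomial form are assumptions for illustration.
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)
params = rng.uniform(0.0, 1.0, 200)                         # one "simulation input" each
outputs = np.sin(3 * params) + 0.05 * rng.normal(size=200)  # noisy quantity of interest

# Fit a cheap polynomial surrogate in place of re-running the full simulation.
coeffs = P.polyfit(params, outputs, deg=5)

def surrogate(x):
    return P.polyval(x, coeffs)

x = 0.42
print(f"surrogate({x}) = {surrogate(x):.3f}, truth = {np.sin(3 * x):.3f}")
```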
The growth of the Internet of Things and wireless technology has led to enormous data generation for applications such as healthcare and other scientific, data-intensive workloads. Cloud-based Storage Area Networks (SANs) have recently been widely used for storing and processing these data. Providing fault-tolerant, continuous access to data with minimal latency and cost is challenging, and requires an efficient fault-tolerance mechanism. Data replication is an efficient fault-tolerance mechanism that existing methodologies have adopted. However, replica placement is itself challenging, and existing methods do not efficiently account for the dynamic application requirements of cloud-based SANs, incurring latency and therefore higher data-transmission cost. This work presents an efficient replica placement and transmission technique, Bipartite Graph based Data Replica Placement (BGDRP), that helps minimize latency and computing cost. The performance of BGDRP is evaluated using a real-time scientific application workflow. The results show that BGDRP reduces data-access latency, computation time, and cost compared with state-of-the-art techniques.
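The abstract names a bipartite-graph formulation but does not spell out the algorithm, so the following is a generic min-cost bipartite assignment sketch of replica placement, with made-up latency values:

```python
# Made-up latency values; BGDRP's actual formulation is not given in the
# abstract, so this is a generic min-cost bipartite assignment.
import numpy as np
from scipy.optimize import linear_sum_assignment

# latency[i][j]: expected access latency (ms) if replica i is placed on node j
latency = np.array([
    [12.0,  5.0, 30.0],
    [ 8.0, 22.0,  9.0],
    [15.0,  7.0,  6.0],
])

replicas, nodes = linear_sum_assignment(latency)   # min-total-cost matching
for r, n in zip(replicas, nodes):
    print(f"replica {r} -> node {n} ({latency[r, n]:.0f} ms)")
print(f"total expected latency: {latency[replicas, nodes].sum():.0f} ms")
```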
Analyzing Large Earth Data Sets: New Tools from the OptIPuter and LOOKING Projects – Larry Smarr
The document discusses two projects, OptIPuter and LOOKING, that aim to analyze large earth data sets using optical networking and grid technologies. OptIPuter extends grid middleware to dedicated optical circuits for earth and medical sciences. LOOKING builds on OptIPuter to provide real-time control of ocean observatories through web and grid services integrated over optical networks. Both projects represent efforts to develop cyberinfrastructure for interactive analysis of remote earth science data and instruments.
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility – inside-BigData.com
In this deck from the Swiss HPC Conference, Mark Wilkinson presents: 40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility.
"DiRAC is the integrated supercomputing facility for theoretical modeling and HPC-based research in particle physics, and astrophysics, cosmology, and nuclear physics, all areas in which the UK is world-leading. DiRAC provides a variety of compute resources, matching machine architecture to the algorithm design and requirements of the research problems to be solved. As a single federated Facility, DiRAC allows more effective and efficient use of computing resources, supporting the delivery of the science programs across the STFC research communities. It provides a common training and consultation framework and, crucially, provides critical mass and a coordinating structure for both small- and large-scale cross-discipline science projects, the technical support needed to run and develop a distributed HPC service, and a pool of expertise to support knowledge transfer and industrial partnership projects. The on-going development and sharing of best-practice for the delivery of productive, national HPC services with DiRAC enables STFC researchers to produce world-leading science across the entire STFC science theory program."
Watch the video: https://wp.me/p3RLHQ-k94
Learn more: https://dirac.ac.uk/
and
http://hpcadvisorycouncil.com/events/2019/swiss-workshop/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses accelerating science discovery with AI inference-as-a-service. It describes showcases using this approach for high energy physics and gravitational wave experiments. It outlines the vision of the A3D3 institute to unite domain scientists, computer scientists, and engineers to achieve real-time AI and transform science. Examples are provided of using AI inference-as-a-service to accelerate workflows for CMS, ProtoDUNE, LIGO, and other experiments.
Cyberinfrastructure and Applications Overview: Howard University June 22 – marpierc
1) Cyberinfrastructure refers to the combination of computing systems, data storage systems, advanced instruments and data repositories, visualization environments, and people that enable knowledge discovery through integrated multi-scale simulations and analyses.
2) Cloud computing, multicore processors, and Web 2.0 tools are changing the landscape of cyberinfrastructure by providing new approaches to distributed computing and data sharing that emphasize usability, collaboration, and accessibility.
3) Scientific applications are increasingly data-intensive, requiring high-performance computing resources to analyze large datasets from sources like gene sequencers, telescopes, sensors, and web crawlers.
Grid optical network service architecture for data intensive applications – Tal Lavian Ph.D.
An integrated software system provides the "glue".
A dynamic optical network serves as a fundamental Grid service in data-intensive Grid applications: it can be scheduled, managed, and coordinated to support collaborative operations.
From supercomputer to super-network:
In the past, computer processors were the fastest part of the system, and peripherals were the bottleneck.
In the future, optical networks will be the fastest part, and computers, processors, storage, visualization, and instrumentation will become the slower "peripherals".
eScience cyberinfrastructure focuses on computation, storage, data, analysis, and workflow.
The network is vital for better eScience.
Grid computing involves distributing computing resources across a network to tackle large problems. The Worldwide LHC Computing Grid (WLCG) was established to support the Large Hadron Collider (LHC) experiment, which produces around 15 petabytes of data annually. The WLCG uses a four-tiered model, with raw data stored at Tier-0 (CERN), copies distributed to Tier-1 data centers, computational resources provided by Tier-2 centers, and Tier-3 facilities providing additional analysis capabilities. This distributed model has proven effective in supporting the first year of LHC data collection and analysis through globally shared computing resources.
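The summary's own numbers give a feel for the sustained rates involved; a quick Python calculation, where the replication factor is an assumption for illustration:

```python
# The summary says the LHC produces ~15 PB per year, stored at Tier-0 and
# copied out to Tier-1 centers. Quick sustained-rate arithmetic; the
# replication factor below is an assumption for illustration.
PB = 1e15
annual_raw_bytes = 15 * PB
seconds_per_year = 365 * 24 * 3600

rate = annual_raw_bytes / seconds_per_year          # bytes/second
print(f"sustained average: {rate / 1e9:.2f} GB/s ({rate * 8 / 1e9:.1f} Gb/s)")

replication = 2                                     # assumption: copies shipped out
print(f"Tier-0 export with {replication}x replication: "
      f"{rate * 8 * replication / 1e9:.1f} Gb/s sustained")
```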
HPC, grid and cloud computing - the past, present, and future challenge – Jason Shih
This document discusses trends in high performance computing (HPC), grid computing, and cloud computing. It provides an overview of HPC cluster performance and interconnects. Grid computing enabled large-scale scientific collaboration through infrastructures like EGEE. The LHC requires petascale computing capabilities. Cloud computing hype is discussed alongside observations of performance and virtualization challenges. The future of computing may involve more sophisticated tools and dynamic, small computing elements.
Keynote talk at the International Conference on Supercomputing 2009, at IBM Yorktown in New York. This is a major update of a talk first given in New Zealand last January. The abstract follows.
The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think FaceBook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.
Opening Keynote Lecture
15th Annual ON*VECTOR International Photonics Workshop
Calit2’s Qualcomm Institute
University of California, San Diego
February 29, 2016
How HPC and large-scale data analytics are transforming experimental science – inside-BigData.com
In this deck from DataTech19, Debbie Bard from NERSC presents: Supercomputing and the scientist: How HPC and large-scale data analytics are transforming experimental science.
"Debbie Bard leads the Data Science Engagement Group NERSC. NERSC is the mission supercomputing center for the USA Department of Energy, and supports over 7000 scientists and 700 projects with supercomputing needs. A native of the UK, her career spans research in particle physics, cosmology and computing on both sides of the Atlantic. She obtained her PhD at Edinburgh University, and has worked at Imperial College London as well as the Stanford Linear Accelerator Center (SLAC) in the USA, before joining the Data Department at NERSC, where she focuses on data-intensive computing and research, including supercomputing for experimental science and machine learning at scale."
Watch the video: https://wp.me/p3RLHQ-kLV
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses high performance cluster computing, including its architecture, systems, applications, and enabling technologies. It provides an overview of cluster computing and classifications of cluster systems. Key components of cluster architecture are discussed, along with representative cluster systems and conclusions. Cluster middleware, resources, and applications are also mentioned.
The Sierra Supercomputer: Science and Technology on a Mission – inside-BigData.com
In this deck from the Stanford HPC Conference, Adam Bertsch from LLNL presents: The Sierra Supercomputer: Science and Technology on a Mission.
"LLNL just celebrated its 65th anniversary. Since 1952, the laboratory has been at the forefront of high performance computing. Initially, HPC was used to accelerate the design and testing of the nation's nuclear stockpile. Since the last U.S. nuclear test in 1992, HPC has been used to validate the safety, security, and reliability of stockpile without nuclear testing.
Our next flagship HPC system at LLNL will be called Sierra. A collaboration between multiple government and industry partners, Sierra and its sister system Summit at ORNL, will pave the way towards Exascale computing architectures and predictive capability."
Watch the video: https://wp.me/p3RLHQ-i4K
Learn more: https://computation.llnl.gov/computers/sierra
and
http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses the future of high performance computing (HPC). It covers several topics:
- Next generation HPC applications will involve larger problems in fields like disaster simulation, urban science, and data-intensive science. Projects like the Square Kilometer Array will generate exabytes of data daily.
- Hardware trends include using many-core processors, accelerators like GPUs, and heterogeneous computing with CPUs and GPUs. Future exascale systems may use conventional CPUs with GPUs or innovative architectures like Japan's Post-K system.
- The top supercomputers in the world currently include Summit, an IBM system at Oak Ridge that combines Power9 CPUs and NVIDIA Volta GPUs, and China's Sunway TaihuLight.
The realization of network softwarization, an overarching buzzword encompassing all software-centric developments from the Software-Defined Networking (SDN) and Network Function Virtualization (NFV) trends, is being enabled through a set of innovations in high-speed data plane design and implementation. Recent efforts include re-architecting the hardware-software interfaces and exposing programmatic interfaces (e.g., OpenFlow), programmable hardware-based pipelines (e.g., the Protocol Independent Switch Architecture – PISA) along with suitable programming languages (e.g., P4), and multiple advances in low-overhead virtualization and fast packet-processing libraries (e.g., DPDK, FD.io) for Linux-based general-purpose processor platforms. This talk provides an overview of relevant ongoing work and discusses the trade-offs of each design and implementation choice of software-defined dataplanes regarding Programmability, Performance, and Portability.
06.07.26
Invited Talk
Cyberinfrastructure for Humanities, Arts, and Social Sciences, A Summer Institute, SDSC
Title: The OptIPuter and Its Applications
La Jolla, CA
1. Building exascale computers requires moving to sub-nanometer scales and steering individual electrons to solve problems more efficiently.
2. Moving data is a major challenge, as moving data off-chip uses 200x more energy than computing with it on-chip.
3. Future computers should optimize for data movement at all levels, from system design to microarchitecture, to minimize energy usage.
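The 200x figure in point 2 invites a quick sensitivity check; in the Python sketch below, only the 200x ratio comes from the text, while the absolute per-operation energy and workload size are assumed for illustration:

```python
# Only the 200x on-chip vs off-chip ratio comes from the text above; the
# per-operation energy and workload size are assumed illustrative figures.
PJ = 1e-12
e_compute = 1 * PJ                 # assumption: ~1 pJ per on-chip operation
e_offchip = 200 * e_compute        # the 200x off-chip movement penalty

ops = 1e18                         # assumption: a 1 exa-op workload
for offchip_fraction in (0.001, 0.01, 0.1):
    joules = ops * ((1 - offchip_fraction) * e_compute
                    + offchip_fraction * e_offchip)
    print(f"{offchip_fraction:6.1%} of operands fetched off-chip -> "
          f"{joules / 3.6e6:6.2f} kWh")
# Even 10% off-chip traffic multiplies the energy bill ~21x, which is why
# data movement should be optimized at every level of the design.
```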
Data centers are large physical facilities that house computing infrastructure for enterprises. They provide utilities like power, cooling, security and shelter for servers and storage equipment. Modern data centers are designed with regions and availability zones for fault tolerance, with each zone consisting of one or more data centers within close network proximity. Key challenges for data centers include efficient cooling of equipment, improving energy proportionality as servers are often idle, optimizing resource utilization through virtualization and dynamic allocation, and managing the immense scale of infrastructure and traffic as cloud providers operate millions of servers globally.
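Energy proportionality, mentioned above, can be captured in a one-line power model; the wattages below are assumed illustrative values, not measurements:

```python
# Energy proportionality in one formula: an ideal server draws power in
# proportion to load; a real one has a high idle floor. Wattages below are
# assumed illustrative values, not measurements.
def power_draw(load, idle_w=120.0, peak_w=300.0):
    """Linear power model: idle floor plus a load-proportional component."""
    return idle_w + (peak_w - idle_w) * load

for load in (0.0, 0.3, 1.0):
    watts = power_draw(load)
    print(f"load {load:4.0%}: {watts:5.1f} W ({watts / power_draw(1.0):.0%} of peak)")
# At 30% load this model draws 58% of peak power, which is why consolidating
# work onto fewer, busier servers (e.g., via virtualization) saves energy.
```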
Metacomputer Architecture of the Global LambdaGrid – Larry Smarr
06.01.13
Invited Talk
Department of Computer Science
Donald Bren School of Information and Computer Sciences
Title: Metacomputer Architecture of the Global LambdaGrid
Irvine, CA
Accelerating TensorFlow with RDMA for high-performance deep learning – DataWorks Summit
Google’s TensorFlow is one of the most popular deep learning (DL) frameworks. In distributed TensorFlow, gradient updates are a critical step governing the total model training time. These updates incur a massive volume of data transfer over the network.
In this talk, we first present a thorough analysis of the communication patterns in distributed TensorFlow. Then we propose a unified way of achieving high performance through enhancing the gRPC runtime with Remote Direct Memory Access (RDMA) technology on InfiniBand and RoCE. Through our proposed RDMA-gRPC design, TensorFlow only needs to run over the gRPC channel and gets the optimal performance. Our design includes advanced features such as message pipelining, message coalescing, zero-copy transmission, etc. The performance evaluations show that our proposed design can significantly speed up gRPC throughput by up to 1.5x compared to the default gRPC design. By integrating our RDMA-gRPC with TensorFlow, we are able to achieve up to 35% performance improvement for TensorFlow training with CNN models.
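Among the listed optimizations, message coalescing is easy to illustrate in isolation; the toy sketch below batches small gradient tensors into larger sends and is not the actual RDMA-gRPC implementation:

```python
# "Message coalescing" from the talk, as a toy: batch many small gradient
# tensors into fewer, larger network messages. Illustrative only; this is
# not the OSU RDMA-gRPC implementation.
import numpy as np

def coalesce(tensors, max_bytes=1024 * 1024):
    """Pack tensors into combined buffers of roughly max_bytes each."""
    buf, size = [], 0
    for t in tensors:
        if buf and size + t.nbytes > max_bytes:
            yield np.concatenate([x.ravel() for x in buf])
            buf, size = [], 0
        buf.append(t)
        size += t.nbytes
    if buf:
        yield np.concatenate([x.ravel() for x in buf])

# Four gradient tensors (float32), e.g. per-layer gradients of a small model:
grads = [np.zeros(n, dtype=np.float32) for n in (100_000, 150_000, 120_000, 80_000)]
messages = list(coalesce(grads))
print(f"{len(grads)} tensors -> {len(messages)} network messages")
```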
Speakers
Dhabaleswar K (DK) Panda, Professor and University Distinguished Scholar, The Ohio State University
Xiaoyi Lu, Research Scientist, The Ohio State University
Similar to TeraGrid Communication and Computation (20)
This document describes an ultra low phase noise frequency synthesizer system for wireless communication applications. The system uses a combination of a fractional-N phase locked loop (PLL), sampling reference PLL, and direct digital synthesizer (DDS). It aims to reduce phase noise and enable higher order modulation schemes for increased data rates. The system comprises a front end module, display, and system on chip with the frequency synthesizer. It provides very low phase deviation of 0.04 degrees through a dual loop design, sampling PLL reference, and high frequency digital components.
A system for providing ultra-low phase noise frequency synthesizers using a Fractional-N PLL (Phase Locked Loop), a Sampling Reference PLL, and a DDS (Direct Digital Synthesizer). Modern advanced communication systems comprise frequency synthesizers that provide a frequency output signal to other parts of the transmitter and receiver so as to enable the system to operate at the set frequency band. The performance of the frequency synthesizer determines the performance of the communication link. Current advanced communication systems use single-loop frequency synthesizers, which cannot fully provide the low phase deviation needed for error-free operation (for 256 QAM, the practical phase deviation for no errors is 0.4-0.5°) that would enable users to receive high data rates. The proposed system overcomes deficiencies of current state-of-the-art communication systems by providing a much lower level of phase deviation error, enabling much higher-order modulation schemes and high data rates.
Embodiments of the present invention present a method and apparatus for photonic line sharing for high-speed routers. Photonic switches receive high-speed optical data streams and produce the data streams to a router operating according to routing logic and produce optical data streams according to destination addresses stored in the data packets. Each photonic switch can be configured as one of a 1:N multiplexer or an M:N cross-connect switch. In one embodiment, optical data is converted to electrical data prior to routing, while an alternate embodiment routes only optical data. Another embodiment transfers large volumes of high-speed data through an optical bypass line in a circuit switched network to bypass the switch fabric thereby routing the data packets directly to the destination. An edge device selects one of the packet switched network or the circuit switched network. The bypass resources are released when the large volume of high-speed data is transferred.
Systems and methods to support sharing and exchanging in a network – Tal Lavian Ph.D.
Embodiments of the invention provide for providing support for sharing and exchanging in a network. The system includes a memory coupled to a processor. The memory includes a database comprising information corresponding to first users and the second users. Each of the first users and the second users are facilitated for sharing or exchanging activity, service or product, based on one or more conditions corresponding thereto. Further, the memory includes one or more instructions executable by the processor to match each of the first users to at least one of the second users. Furthermore, the instructions may inform each of the first users about the match with the at least one of the second users when all the conditions are met by the at least one second user based on the information corresponding to each of the second users.
Systems and methods for visual presentation and selection of IVR menu – Tal Lavian Ph.D.
Embodiments of the invention provide a system for generating an Interactive Voice Response (IVR) database, the system comprising a processor and a memory coupled to the processor. The memory comprising a list of telephone numbers associated with one or more destinations implementing IVR menus, wherein the one or more destinations are grouped based on a plurality of categories of the IVR menus. Further the memory includes instructions executable by said processor for automatically communicating with the one of more destinations, and receiving at least one customization record from said at least one destination to store in the IVR database.
Various embodiments allow Grid applications to access resources shared in communication network domains. Grid Proxy Architecture for Network Resources (GPAN) bridges Grid services serving user applications and network services controlling network devices through proxy functions. At times, GPAN employs distributed network service peers (NSP) in network domains to discover, negotiate and allocate network resources for Grid applications. An elected master NSP is the unique Grid node that runs GPAN and represents the whole network to share network resources to Grids without Grid involvement of network devices. GPAN provides the Grid Proxy service (GPS) to interface with Grid services and applications, and the Grid Delegation service (GDS) to interface with network services to utilize network resources. In some cases, resource-based XML messaging can be employed for the GPAN proxy communication.
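The abstract mentions resource-based XML messaging for the GPAN proxy communication without giving a schema; the element names in this Python sketch are hypothetical:

```python
# Hypothetical shape of a resource-based XML message for GPAN-style proxy
# communication; the schema and element names are assumptions, since the
# abstract does not specify them.
import xml.etree.ElementTree as ET

req = ET.Element("resource-request", app="grid-transfer-42")
path = ET.SubElement(req, "path")
ET.SubElement(path, "endpoint", role="source").text = "cluster-a.example.org"
ET.SubElement(path, "endpoint", role="sink").text = "cluster-b.example.org"
ET.SubElement(req, "bandwidth", unit="Gbps").text = "10"
ET.SubElement(req, "window", start="2004-01-01T08:00Z", end="2004-01-01T10:00Z")

print(ET.tostring(req, encoding="unicode"))
```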
Systems and methods for electronic communications – Tal Lavian Ph.D.
Embodiments of the invention provide a system for enhancing user interaction with the Internet of Things. The system includes a processor, and a memory coupled to the processor. The memory includes a database having one or more options corresponding to each of the Internet of Things. The memory further includes instructions executable by the processor to share at least one of the one or more options with one or more users of the things. Further, the instructions receive information corresponding to selection of the at least one option by the one or more users. Additionally, the instructions update the database based on the selection of the at least one option by the one or more users. Further, a device for enhancing interaction with the things is also disclosed.
Radar target detection system for autonomous vehicles with ultra-low phase noise – Tal Lavian Ph.D.
An object detection system for autonomous vehicle, comprising a radar unit and at least one ultra-low phase noise frequency synthesizer, is provided. The radar unit configured for detecting the presence and characteristics of one or more objects in various directions. The radar unit may include a transmitter for transmitting at least one radio signal; and a receiver for receiving the at least one radio signal returned from the one or more objects. The ultra-low phase noise frequency synthesizer may utilize Clocking device, Sampling Reference PLL, at least one fixed frequency divider, DDS and main PLL to reduce phase noise from the returned radio signal. This proposed system overcomes deficiencies of current generation state of the art Radar Systems by providing much lower level of phase noise which would result in improved performance of the radar system in terms of target detection, characterization etc. Further, a method for autonomous vehicle is also disclosed.
Method and apparatus for scheduling resources on a switched underlay network – Tal Lavian Ph.D.
A method and apparatus for resource scheduling on a switched underlay network (18) enables coordination, scheduling, and scheduling optimization to take place taking into account the availability of the data and the network resources comprising the switched underlay network (18). Requested transfers may be fulfilled by assessing the requested transfer parameters, the availability of the network resources required to fulfill the request, the availability of the data to be transferred, the availability of sufficient storage resources to receive the data, and other potentially conflicting requested transfers. In one embodiment, the requests are under-constrained to enable transfer scheduling optimization to occur: requests can be scheduled taking into account factors such as transfer priority, transfer duration, the amount of time since the transfer request was submitted, and many other factors.
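The abstract does not give the scheduling algorithm itself; as a sketch of what scheduling under-constrained requests might look like, here is a greedy single-link scheduler where each request carries a flexible start window (all names and numbers are illustrative):

```python
# Greedy sketch of under-constrained transfer scheduling on one shared link;
# the algorithm, names, and numbers are illustrative assumptions, not the
# patented method.
def schedule(requests, link_free_at=0):
    """requests: (name, priority, duration, earliest_start, latest_start)."""
    plan, t = [], link_free_at
    for name, prio, dur, earliest, latest in sorted(
            requests, key=lambda r: -r[1]):       # highest priority first
        start = max(t, earliest)
        if start <= latest:                       # fits inside its window
            plan.append((name, start, start + dur))
            t = start + dur
        else:
            plan.append((name, None, None))       # cannot be satisfied
    return plan

reqs = [("bulk-archive", 1, 5, 0, 20),
        ("urgent-dataset", 3, 2, 0, 4),
        ("nightly-sync", 2, 3, 1, 10)]
for name, s, e in schedule(reqs):
    print(name, "->", "rejected" if s is None else f"[{s}, {e})")
```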
Dynamic assignment of traffic classes to a priority queue in a packet forwarding device – Tal Lavian Ph.D.
An apparatus and method for dynamic assignment of classes of traffic to a priority queue. Bandwidth consumption by one or more types of packet traffic received in the packet forwarding device is monitored to determine whether the bandwidth consumption exceeds a threshold. If the bandwidth consumption exceeds the threshold, assignment of at least one type of packet traffic of the one or more types of packet traffic is changed from a queue having a first priority to a queue having a second priority.
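Read as code, the mechanism in the abstract is a threshold-driven reassignment of traffic classes between queues; the sketch below is a direct, illustrative rendering (classes, thresholds, and the restore rule are assumptions):

```python
# Illustrative rendering of the abstract: monitor per-class bandwidth, and
# if a class exceeds its threshold, demote it from the high-priority queue.
# Classes, thresholds, and the restore rule are assumptions.
class QueueAssigner:
    def __init__(self, threshold_mbps):
        self.threshold = threshold_mbps
        self.queue = {}                        # traffic class -> queue name

    def assign(self, traffic_class, default="high"):
        self.queue.setdefault(traffic_class, default)

    def observe(self, traffic_class, measured_mbps):
        """Called by the bandwidth monitor each measurement interval."""
        if measured_mbps > self.threshold and self.queue[traffic_class] == "high":
            self.queue[traffic_class] = "low"   # demote the greedy class
        elif measured_mbps <= self.threshold and self.queue[traffic_class] == "low":
            self.queue[traffic_class] = "high"  # assumed: restore when it behaves

qa = QueueAssigner(threshold_mbps=100)
qa.assign("voip")
qa.assign("bulk-ftp")
qa.observe("bulk-ftp", 250)   # exceeds threshold -> demoted
qa.observe("voip", 3)
print(qa.queue)               # {'voip': 'high', 'bulk-ftp': 'low'}
```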
Method and apparatus for using a command design pattern to access and configure network devices – Tal Lavian Ph.D.
This patent application describes a method and system for remotely accessing and configuring network devices using XML documents and a common design pattern. An XML request is sent from a client to a network device to request that a service be performed locally on the device. The network device includes a service engine that can parse the XML request using an XML DTD, instantiate the requested service, interact with device hardware and software to execute the service, and optionally return a response to the client. The use of XML documents and a common design pattern allows network devices to be accessed and configured in a flexible manner without needing to be pre-programmed for specific requests.
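The service engine described above maps naturally onto the command design pattern; this Python sketch shows the shape of it, with a hypothetical request format and service names:

```python
# Command-pattern sketch of the described service engine: parse an XML
# request, instantiate the named service, execute it. The request format
# and service names are hypothetical.
import xml.etree.ElementTree as ET

class GetInterfaces:                       # one "command" class per service
    def execute(self, params):
        return "<response><interface>eth0</interface></response>"

class SetVlan:
    def execute(self, params):
        return f"<response>vlan {params.get('id')} configured</response>"

SERVICES = {"get-interfaces": GetInterfaces, "set-vlan": SetVlan}

def service_engine(xml_request: str) -> str:
    """Parse the request, instantiate the named service, and run it."""
    root = ET.fromstring(xml_request)
    command = SERVICES[root.get("name")]()             # instantiate by name
    params = {p.get("key"): p.get("value") for p in root.findall("param")}
    return command.execute(params)

print(service_engine('<service name="set-vlan"><param key="id" value="42"/></service>'))
```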
Embodiments of the invention provide means to the users of the system to provide ratings and corresponding feedback for enhancing the genuineness in the ratings. The system includes a memory coupled to a processor. The memory includes one or more instructions executable by the processor to enable the users of the system to rate each other based on at least one of sharing, exchanging, and selling one of activity, service or product. The system may provide a mechanism to encourage genuineness in ratings provided by the users. Furthermore, the instructions facilitate the rating receivers to provide feedbacks corresponding to the received ratings. The feedback includes accepting or objecting to a particular rating. Moreover, the memory includes instructions executable by the processor to enable the system to determine genuineness of an objection raised by a rating receiver.
Embodiments of the present invention provide a system for enhancing reliability in the computation of ratings provided by a user over a social network. The system comprises a processor and a memory coupled to the processor. The memory further comprises a rater score database, a satisfaction database, a social network registration database, a user profile database, and a plurality of instructions executable by the processor. Said instructions in the memory are enabled to accept a message from at least one user, wherein said message comprises a satisfaction score associated with at least one service provider, and to retrieve a rater score associated with said at least one user from said rater score database. Further, the memory includes instructions to compute a new satisfaction score based on said rater score and said satisfaction score and update said satisfaction database to include said new satisfaction score. In a similar manner, the new satisfaction score can be computed based upon the information stored in the social network registration database and user profile database.
Systems and methods for visual presentation and selection of ivr menuTal Lavian Ph.D.
Embodiments of the invention provide a system for generating an Interactive Voice Response (IVR) database, the system comprising a processor and a memory coupled to the processor. The memory comprises a list of telephone numbers associated with one or more destinations implementing IVR menus, wherein the one or more destinations are grouped based on a plurality of categories of the IVR menus. Further, the memory includes instructions executable by the processor for automatically communicating with the one or more destinations, and receiving at least one customization record from at least one destination to store in the IVR database.
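A hypothetical in-memory shape for such a database, with destinations grouped by category and a lookup that returns the stored menu for visual presentation (all numbers, categories, and menu entries are invented):

```python
# Destinations grouped by category; each holds the customization
# record retrieved by automatically communicating with the number.
ivr_db = {
    "banking": {
        "+1-800-555-0100": {
            "menu": {"1": "balances", "2": "transfers", "0": "agent"},
            "last_crawled": "2012-06-01",
        },
    },
    "airlines": {
        "+1-800-555-0199": {
            "menu": {"1": "reservations", "2": "flight status", "0": "agent"},
            "last_crawled": "2012-06-02",
        },
    },
}

def visual_menu(category: str, number: str) -> dict:
    """Return the stored menu so a caller can see and tap options
    instead of listening through the voice prompts."""
    return ivr_db[category][number]["menu"]
```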
A system for providing ultra-low phase noise frequency synthesizers using a Fractional-N PLL (Phase-Locked Loop), a Sampling Reference PLL, and a DDS (Direct Digital Synthesizer). Modern advanced communication systems comprise frequency synthesizers that provide a frequency output signal to other parts of the transmitter and receiver so as to enable the system to operate at the set frequency band. The performance of the frequency synthesizer determines the performance of the communication link. Today's advanced communication systems use single-loop frequency synthesizers, which cannot fully achieve the low phase deviation needed for error-free operation (for 256 QAM, the practical phase deviation for no errors is 0.4-0.5°) that would enable users to receive high data rates. The proposed system overcomes these deficiencies of current state-of-the-art communication systems by providing a much lower phase deviation error, which enables much higher-order modulation schemes and higher data rates.
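The quoted target can be checked against a synthesizer's phase-noise profile using the standard relation sigma = sqrt(2 * integral of 10^(L(f)/10) df); the profile values below are illustrative, not measurements of this system:

```python
import math

def rms_phase_error_deg(offsets_hz, l_dbc_hz):
    """Integrate a single-sideband phase-noise profile L(f) [dBc/Hz]
    over the given offset frequencies and return the RMS phase error
    in degrees."""
    total = 0.0
    for (f1, l1), (f2, l2) in zip(zip(offsets_hz, l_dbc_hz),
                                  zip(offsets_hz[1:], l_dbc_hz[1:])):
        s1, s2 = 10 ** (l1 / 10), 10 ** (l2 / 10)
        total += 0.5 * (s1 + s2) * (f2 - f1)   # trapezoidal rule
    return math.degrees(math.sqrt(2 * total))

# Invented profile for illustration; the abstract's 256-QAM target
# is roughly 0.4-0.5 degrees RMS.
offsets = [1e2, 1e3, 1e4, 1e5, 1e6]
noise   = [-80, -95, -105, -115, -135]
print(f"{rms_phase_error_deg(offsets, noise):.2f} deg RMS")
```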
Building a Raspberry Pi Robot with Dot NET 8, Blazor and SignalRPeter Gallagher
In this session delivered at NDC Oslo 2024, I talk about how you can control a 3D printed Robot Arm with a Raspberry Pi, .NET 8, Blazor and SignalR.
I also show how you can use a Unity app on a Meta Quest 3 to control the arm in VR too.
You can find the GitHub repo and workshop instructions here;
https://bit.ly/dotnetrobotgithub
1. TeraGrid Communication and
Computation
Tal Lavian
tlavian@cs.berkeley.edu
Brainstorm and concepts
Many slides and most of the graphics are taken from other presentations
2. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
3. The Grid Problem
Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Some relation to Sahara
Service composition: computation, servers, storage, disk, network…
Sharing, cooperating, peering, brokering…
4. TeraGrid Wide Area Network -
NCSA, ANL, SDSC, Caltech
[Network map: DTF backplane (4×λ: 40 Gbps) linking Los Angeles, San Diego, Chicago, Indianapolis, and Urbana over Abilene. Chicago-area detail: StarLight international optical peering point (see www.startap.net) at Starlight/NW Univ, multiple carrier hubs, UIC, Ill Inst of Tech, Univ of Chicago, NCSA/UIUC, and ANL (Abilene NOC) on I-WIRE. Legend: OC-48 (2.5 Gb/s, Abilene); multiple 10 GbE (Qwest); multiple 10 GbE (I-WIRE dark fiber). Solid lines in place and/or available by October 2001; dashed I-WIRE lines planned for summer 2002. Source: Charlie Catlett, Argonne]
5. The 13.6 TF TeraGrid:
Computing at 40 Gb/s
[Site diagram: each site's resources attach to external networks; NCSA/PACI (8 TF, 240 TB), SDSC (4.1 TF, 225 TB), Caltech, and Argonne, with HPSS archival storage at three sites and UniTree at NCSA; per-site link counts are shown in the figure.]
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne – www.teragrid.org
6. 4 TeraGrid Sites Have Focal Points
SDSC – The Data Place
Large-scale and high-performance data analysis/handling
Every Cluster Node is Directly Attached to SAN
NCSA – The Compute Place
Large-scale, Large Flops computation
Argonne – The Viz place
Scalable Viz walls
Caltech – The Applications place
Data and flops for applications – especially some of the GriPhyN Apps
Specific machine configurations reflect this
7. TeraGrid building blocks
Distributed, multisite facility
single site and “Grid enabled” capabilities
uniform compute node selection and interconnect networks at 4 sites
central “Grid Operations Center”
at least one 5+ teraflop site and newer generation processors
SDSC at 4+ TF, NCSA at 6.1-8 TF with McKinley processors
at least one additional site coupled with the first
four core sites: SDSC, NCSA, ANL, and Caltech
Ultra high-speed networks (statically configured)
multiple gigabits/second
modular 40 Gb/s backbone (4 × 10 GbE)
Remote visualization
data from one site visualized at another
high-performance commodity rendering and visualization system
Argonne hardware visualization support
data serving facilities and visualization displays
NSF - $53M award in August 2001
8. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
9. What applications are being targeted
for Grid-enabled computing?
Traditional
Quantum Chromodynamics
Biomolecular Dynamics
Weather Forecasting
Cosmological Dark Matter
Biomolecular Electrostatics
Electric and Magnetic Molecular Properties
10. Multi-disciplinary Simulations: Aviation Safety
[System diagram, source NASA, showing the coupled sub-system models:
Wing models – lift capabilities, drag capabilities, deflection capabilities, responsiveness
Engine models – thrust performance, reverse-thrust performance, responsiveness, fuel consumption
Landing gear models – braking performance, steering capabilities, traction, dampening capabilities
Crew capabilities – accuracy, perception, stamina, reaction times, SOPs
Plus airframe models, stabilizer models, and human models]
Whole-system simulations are produced by coupling all of the sub-system simulations
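A toy sketch of that coupling idea, with placeholder physics and model names (none of this is NASA's actual code):

```python
# Toy coupling loop: each sub-system model reads the shared aircraft
# state and writes back its own contribution once per time step.
state = {"speed": 70.0, "thrust": 0.0, "lift": 0.0, "braking": 0.0}

def engine_model(s):
    s["thrust"] = 180.0 if s["speed"] < 80 else 90.0

def wing_model(s):
    s["lift"] = 0.5 * s["speed"] ** 2 * 0.02   # toy lift relation

def gear_model(s):
    s["braking"] = 40.0 if s["speed"] > 75 else 0.0

SUBSYSTEMS = [engine_model, wing_model, gear_model]

def step(s, dt=0.1):
    # In a real coupled run these would be separate codes, possibly
    # running at different sites and exchanging state over the grid.
    for model in SUBSYSTEMS:
        model(s)
    s["speed"] += (s["thrust"] - s["braking"]) / 60.0 * dt

for _ in range(5):
    step(state)
print(round(state["speed"], 2))   # the whole-system trajectory emerges
```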
11. New Results Possible on TeraGrid
Biomedical Informatics Research Network (National Institutes of Health):
Evolving reference set of brains provides essential data for developing therapies for neurological disorders (multiple sclerosis, Alzheimer’s, etc.).
Pre-TeraGrid:
One lab
Small patient base
4 TB collection
Post-TeraGrid:
Tens of collaborating labs
Larger population sample
400 TB data collection: more brains, higher resolution
Multiple-scale data integration and analysis
12. Grid Communities Applications:
Data Grids for High Energy Physics
There is a “bunch crossing” every 25 nsecs.
There are 100 “triggers” per second
Each triggered event is ~1 MByte in size
[Tier diagram, source Harvey Newman, Caltech: the Online System feeds the CERN Computer Centre (Tier 0), with an offline processor farm of ~20 TIPS; Tier 1 regional centres (FermiLab ~4 TIPS, France, Italy, Germany); Tier 2 centres of ~1 TIPS each, including Caltech; institutes of ~0.25 TIPS with physics data caches; physicist workstations (Tier 4). Approximate link rates shown: ~PBytes/sec out of the detector, ~100 MBytes/sec, ~622 Mbits/sec between tiers (or air freight, deprecated), and ~1 MBytes/sec to workstations. 1 TIPS is approximately 25,000 SpecInt95 equivalents.]
Physicists work on analysis “channels”. Each institute will have ~10 physicists working on one or more channels; data for these channels should be cached by the institute server.
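A quick sanity check of the quoted figures (the yearly running time is an assumption):

```python
# Back-of-the-envelope check of the numbers on this slide.
trigger_rate_hz = 100          # triggered events per second
event_size_mb = 1.0            # ~1 MByte per event

raw_stream = trigger_rate_hz * event_size_mb
print(f"Triggered stream: ~{raw_stream:.0f} MBytes/sec")   # ~100

seconds_per_year = 3.15e7      # assume ~1 year of running
print(f"Yearly volume: ~{raw_stream * seconds_per_year / 1e9:.1f} PB")

# A ~622 Mbit/s tier link moves at most ~78 MBytes/sec:
print(f"622 Mbit/s link: ~{622 / 8:.0f} MBytes/sec")
```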
13. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
14. Grid Computing Concept
New applications enabled by the coordinated
use of geographically distributed resources
E.g., distributed collaboration, data access and analysis,
distributed computing
Persistent infrastructure for Grid computing
E.g., certificate authorities and policies, protocols for resource
discovery/access
Original motivation, and support, from high-end
science and engineering; but has wide-ranging
applicability
15. Globus Hourglass
Focus on architecture issues
Propose set of core services as basic
infrastructure
Use to construct high-level, domain-specific
solutions
Design principles
Keep participation cost low
Enable local control
Support for adaptation
“IP hourglass” model
[Hourglass diagram, top to bottom: Applications – diverse global services – core Globus services – local OS]
16. Elements of the Problem
Resource sharing
Computers, storage, sensors, networks, …
Sharing always conditional: issues of trust, policy, negotiation,
payment, …
Coordinated problem solving
Beyond client-server: distributed data analysis, computation,
collaboration, …
Dynamic, multi-institutional virtual orgs
Community overlays on classic org structures
Large or small, static or dynamic
17. Gilder vs. Moore – Impact on the Future
of Computing
[Log-growth chart, 1995–2007, scale 100 to 1M: WAN/MAN bandwidth doubles every 9 months, processor performance every 18 months; the gap widens roughly 10× every 5 years.]
18. Improvements in Large-Area Networks
Network vs. computer performance
Computer speed doubles every 18 months
Network speed doubles every 9 months
Difference = order of magnitude per 5 years
1986 to 2000
Computers: x 500
Networks: x 340,000
2001 to 2010
Computers: x 60
Networks: x 4000
Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.
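The doubling times and historical factors on this slide are mutually consistent, as a short calculation shows:

```python
import math

def growth_over(months: float, doubling_months: float) -> float:
    # Compound growth: 2^(elapsed / doubling time)
    return 2 ** (months / doubling_months)

# Doubling every 18 vs. 9 months over one 5-year window:
cpu = growth_over(60, 18)      # ~10x
net = growth_over(60, 9)       # ~100x
print(f"CPUs: {cpu:.0f}x, networks: {net:.0f}x, gap: {net / cpu:.0f}x")

# The 1986-2000 factors imply the same doubling times:
print(f"x500 over 14 years -> doubling every "
      f"{14 * 12 / math.log2(500):.1f} months")      # ~18.7
print(f"x340,000 over 14 years -> doubling every "
      f"{14 * 12 / math.log2(340_000):.1f} months")  # ~9.1
```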
19. Evolving Role of Optical Layer
[Chart, source IBM WDM research: transport system capacity (Mb/s, log scale 10^1–10^6) vs. year (1985–2000). TDM line rates climbed from 135 Mb/s to 565 Mb/s, 1.7 Gb/s, and a 10 Gb/s transport line rate; service interface rates (OC-12c, OC-48c, OC-192c) equal transport line rates. WDM multiplies the OC-48/OC-192 line rate by the wavelength count (2λ, 4λ, 8λ, 32λ, 160λ). Reference curves: LAN standards (Ethernet, Fast Ethernet, Gb Ethernet, 10 Gb Ethernet) and the Internet backbone (T1, T3, OC-3c, OC-48c, OC-192).]
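The chart's WDM scaling is simply line rate times wavelength count, e.g.:

```python
# Transport capacity = per-wavelength line rate x number of wavelengths.
def wdm_capacity_gbps(line_rate_gbps: float, wavelengths: int) -> float:
    return line_rate_gbps * wavelengths

for n in (2, 4, 8, 32, 160):            # wavelength counts on the chart
    print(f"{n:>3} x 10 Gb/s = {wdm_capacity_gbps(10, n):>6.0f} Gb/s")
# 160 wavelengths at the OC-192 line rate give 1.6 Tb/s per fiber.
```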
20. Scientific Software Infrastructure:
One of the Major Software Challenges
Peak Performance is skyrocketing (more than Moore’s Law)
but ...
Efficiency has declined from 40-50% on the vector supercomputers of the 1990s to as little as 5-10% on parallel supercomputers of today, and may decrease further on future machines
Research challenge is software
Scientific codes to model and simulate physical processes and
systems
Computing and mathematics software to enable use of
advanced computers for scientific applications
Continuing challenge as computer architectures undergo fundamental changes: algorithms that scale to thousands or even millions of processors
21. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
22. Globus Approach
A toolkit and collection of services addressing
key technical problems
Modular “bag of services” model
Not a vertically integrated solution
General infrastructure tools (aka middleware) that can be applied
to many application domains
Inter-domain issues, rather than clustering
Integration of intra-domain solutions
Distinguish between local and global services
23. Globus Technical Focus & Approach
Enable incremental development of grid-enabled
tools and applications
Model neutral: Support many programming models, languages,
tools, and applications
Evolve in response to user requirements
Deploy toolkit on international-scale
production grids and testbeds
Large-scale application development testing
Information-rich environment
Basis for configuration and adaptation
24. Layered Grid Architecture
(By Analogy to Internet Architecture)
Application
Collective – “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services
Resource – “Sharing single resources”: negotiating access, controlling use
Connectivity – “Talking to things”: communication (Internet protocols) & security
Fabric – “Controlling things locally”: access to, and control of, resources
[Drawn alongside the Internet protocol architecture: Application / Transport / Internet / Link]
For more info: www.globus.org/research/papers/anatomy.pdf
25. Globus Architecture?
No “official” standards exist
But:
Globus Toolkit has emerged as the de facto standard for several
important Connectivity, Resource, and Collective protocols
Technical specifications are being developed for architecture
elements: e.g., security, data, resource management, information
26. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
27. Static lightpath setting
NCSA, ANL, SDSC, Caltech
DTF Backplane (4×λ: 40 Gbps)
[Same network map as slide 4: Los Angeles, San Diego, Chicago, Indianapolis, and Urbana over Abilene. Legend: OC-48 (2.5 Gb/s, Abilene); multiple 10 GbE (Qwest); multiple 10 GbE (I-WIRE dark fiber). Solid lines in place and/or available by October 2001; dashed I-WIRE lines planned for summer 2002. Source: Charlie Catlett, Argonne]
28. Lightpath for OVPN
[Diagram: ASON nodes on an optical ring with a mirror server; OVPN flows carrying video and HDTV over optical fiber and channels]
Lightpath setup
One- or two-way
Rates: OC-48, OC-192 and OC-768
QoS constraints
On demand
Aggregation of BW
29. Dynamic Lightpath setting
[Diagram: ASON optical ring with a main server and a mirror server; route 1 is the direct path, route 2 a resource-optimized alternative, route 3 leads to a mirror site]
Alternative lightpath when lightpath setup fails
Resource optimization (route 2)
Route to mirror sites (route 3)
Load balancing
Triggers: long response time, congestion, fault
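The fallback logic this slide describes can be sketched as a simple control-plane loop (the route names and the failure stub are placeholders, not an ASON API):

```python
# Toy control-plane decision: try the direct lightpath, fall back to an
# optimized route, then to a mirror site, as on this slide.
ROUTES = [
    ("route 1", "main server, direct"),
    ("route 2", "main server, resource-optimized"),
    ("route 3", "mirror server"),
]

def setup_lightpath(route_name: str) -> bool:
    """Stand-in for signalling; returns False on congestion, fault,
    or long response time."""
    return route_name != "route 1"        # pretend route 1 is congested

def connect() -> str:
    for name, desc in ROUTES:
        if setup_lightpath(name):
            return f"established {name} ({desc})"
    raise RuntimeError("no lightpath available")

print(connect())   # established route 2 (main server, resource-optimized)
```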
30. Control Plane
[Diagram labels: Apps; Clusters; Dynamically Allocated Lightpaths; Switch Fabrics; Physical Monitoring; Multiple Architectural Considerations]
31. Agenda
Introduction
Some applications
TeraGrid Architecture
Globus toolkit
Future comm direction
Summary
32. Summary
The Grid problem: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
Grid architecture: Emphasize protocol and service definition
to enable interoperability and resource sharing
Globus Toolkit a source of protocol and API definitions,
reference implementations
Communication today is static; the next wave is dynamic optical VPNs
Some relation to Sahara
Service composition: computation, servers, storage, disk,
network…
Sharing, cooperating, peering, brokering…
35. Wavelengths and the Future
Wavelength services are causing a network
revolution:
Core long distance SONET Rings will be replaced by meshed networks
using wavelength cross-connects
Re-invention of pre-SONET network architecture
Improved transport infrastructure will exist for
IP/packet services
Electrical/Optical grooming switches will emerge at
edges
Automated Restoration (algorithm/GMPLS driven)
becomes technically feasible.
Operational implementation will take some time
37. Internet Reality
[Diagram: data centers connect through SONET rings in the access and metro segments, with DWDM in the long haul: Access – Metro – Long Haul – Metro – Access]
38. OVPN on Optical Network
[Diagram: a lightpath providing an OVPN across the optical network]
39. Three networks in The Internet
[Diagram: Long-Haul Core (WAN); Metro Network (MAN) with a metro core backbone; Local Network (LAN); ASON nodes interconnect the segments, with PX and OPE elements at the access edge]
40. Data Transport Connectivity
Packet Switch
Data-optimized: Ethernet, TCP/IP
Network use: LAN
Advantages: efficient, simple, low cost
Disadvantages: unreliable
Circuit Switch
Voice-oriented: SONET, ATM
Network uses: metro and core
Advantages: reliable
Disadvantages: complicated, high cost
Efficiency vs. reliability
41. Global Lambda Grid -
Photonic Switched Network
[Diagram: photonic switched network carrying wavelengths λ1 and λ2]
42. The Metro Bottleneck
[Diagram: end users reach other sites via Access – Metro – Core. LAN side: Ethernet LAN and IP/DATA at 1 GigE; long-haul core: OC-192, DWDM n×λ, 10G and 40G+; metro access is limited to DS1, DS3, LL/FR/ATM at 1–40 Meg, and OC-12/OC-48, making the metro segment the bottleneck between access and core.]
Editor's Notes
Explanation of dashed lines:
St. Louis: State of Illinois fiber between St. Louis and Chicago will be connected to I-WIRE by November 2001. This fiber will terminate at 900 Walnut Street-- the Switch Data Services carrier collocation facility in St. Louis (location of St. Louis Gigapop). In Chicago the State of Illinois fiber will connect both directly to Argonne (which will serve as one of the State’s repeater/opto-electronic equipment hubs) and to the James R. Thompson Center downtown. I-WIRE fiber connects the James R. Thompson Center to the Starlight hub via the Qwest POP.
Indianapolis: We are working with Indiana University to obtain fiber either to Urbana or Chicago. We have preliminary pricing and expect at least a pair of fiber to be available by Summer 2002 if not before.
We define Grid architecture in terms of a layered collection of protocols.
Fabric layer includes the protocols and interfaces that provide access to the resources that are being shared, including computers, storage systems, datasets, programs, and networks. This layer is a logical view rather than a physical view. For example, the view of a cluster with a local resource manager is defined by the local resource manager, and not the cluster hardware. Likewise, the fabric provided by a storage system is defined by the file system that is available on that system, not the raw disk or tapes.
The connectivity layer defines core protocols required for Grid-specific network transactions. This layer includes the IP protocol stack (system level application protocols [e.g. DNS, RSVP, Routing], transport and internet layers), as well as core Grid security protocols for authentication and authorization.
Resource layer defines protocols to initiate and control sharing of (local) resources. Services defined at this level are gatekeeper, GRIS, along with some user oriented application protocols from the Internet protocol suite, such as file-transfer.
Collective layer defines protocols that provide system oriented capabilities that are expected to be wide scale in deployment and generic in function. This includes GIIS, bandwidth brokers, resource brokers,….
Application layer defines protocols and services that are parochial in nature, targeted towards a specific application domain or class of applications.