The document discusses a holistic aggregate resource environment project involving IBM Research, Sandia National Labs, Bell Labs, and CMU. The goals of the project include leveraging aggregation as a first-class systems construct, distributing system services throughout supercomputers, and exploring native interconnect utilization. Research topics discussed include offload/acceleration models, right-weight kernels, and topologies.
This document discusses how to create a domain-specific language (DSL) using Ruby. It begins with an introduction to DSLs and examples of external and internal DSLs. It then demonstrates how to build a DSL for configuring Meetup groups by parsing a configuration file using Ruby's instance_eval method. Key points are that instance_eval interprets a string as Ruby code in the context of an object, and using it with a block changes the default receiver inside the block. The document provides sample code for loading the configuration and implementing setter methods to configure the Meetup object.
The document discusses optimizing performance in MapReduce jobs. It covers understanding bottlenecks through metrics and logs, tuning parameters such as io.sort.mb and io.sort.record.percent to reduce spills during the map task's sort-and-spill phase, and tips for reducer fetch tuning. The goal is to help developers understand and address bottlenecks in their MapReduce jobs to improve performance.
Hadoop Summit 2012 | Optimizing MapReduce Job Performance (Cloudera, Inc.)
Optimizing MapReduce job performance is often seen as something of a black art. In order to maximize performance, developers need to understand the inner workings of the MapReduce execution framework and how they are affected by various configuration parameters and MR design patterns. The talk will illustrate the underlying mechanics of job and task execution, including the map side sort/spill, the shuffle, and the reduce side merge, and then explain how different job configuration parameters and job design strategies affect the performance of these operations. Though the talk will cover internals, it will also provide practical tips, guidelines, and rules of thumb for better job performance. The talk is primarily targeted towards developers directly using the MapReduce API, though it will also include some tips for users of higher-level frameworks.
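As a back-of-the-envelope illustration (not from the talk itself), the map-side spill count can be estimated from the sort buffer size: a map task that emits more bytes than the buffer holds must spill to disk repeatedly and merge the spill files later. A minimal sketch, assuming the simplified model that a spill is triggered each time the buffer reaches its spill threshold:

```python
import math

def estimate_spills(map_output_mb, io_sort_mb, spill_percent=0.8):
    """Rough estimate of spill files for one map task.

    Simplified model: a spill happens each time the in-memory sort
    buffer fills to io.sort.mb * io.sort.spill.percent.
    """
    usable_mb = io_sort_mb * spill_percent
    return max(1, math.ceil(map_output_mb / usable_mb))

# A map task emitting 1 GB of intermediate data with a 100 MB buffer:
print(estimate_spills(1024, 100))  # 13 spill files -> extra merge passes
print(estimate_spills(1024, 512))  # 3 spill files
```

Fewer spills means fewer disk round trips and merge passes, which is why raising io.sort.mb is one of the first knobs the talk discusses.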
This document provides instructions for configuring a single node Hadoop deployment on Ubuntu. It describes installing Java, adding a dedicated Hadoop user, configuring SSH for key-based authentication, disabling IPv6, installing Hadoop, updating environment variables, and configuring Hadoop configuration files including core-site.xml, mapred-site.xml, and hdfs-site.xml. Key steps include setting JAVA_HOME, configuring HDFS directories and ports, and setting hadoop.tmp.dir to the local /app/hadoop/tmp directory.
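For reference, the shape of the core-site.xml settings described above is roughly as follows (the port and /app/hadoop/tmp path follow the guide's example; treat this as an illustrative sketch, not a canonical configuration):

```xml
<!-- core-site.xml: base temp directory and default filesystem -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
```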
Apache Pig is a platform for analyzing large datasets that consists of a high-level data flow language called Pig Latin and an infrastructure for evaluating Pig Latin programs. Pig Latin scripts are compiled into sequences of MapReduce jobs that can run on Hadoop for large scale parallel processing. Pig aims to provide a simpler programming model than raw MapReduce while still allowing for optimization and parallelization of queries. Pig programs can be run interactively using the Grunt shell or by specifying a Pig Latin script to execute.
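To give a flavour of Pig Latin, here is the standard word-count sketch (the input and output paths are placeholders):

```pig
-- Classic word count; 'input.txt' and 'wordcounts' are hypothetical paths.
lines   = LOAD 'input.txt' AS (line:chararray);
words   = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
grouped = GROUP words BY word;
counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS n;
STORE counts INTO 'wordcounts';
```

Each of these statements describes a data-flow step; Pig compiles the whole script into one or more MapReduce jobs.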
PuppetCamp SEA 1 - Puppet Deployment at OnApp (Walter Heck)
Wai Keen Woon, CTO of the CDN division at OnApp Malaysia, gave an overview of what the Puppet architecture at OnApp looks like. The CDN division at OnApp is a large provider of CDN services, which makes it a very interesting candidate for a case study.
Netty is a Java framework that provides tools for developing high performance and event-driven network applications. It uses non-blocking I/O and zero-copy techniques to minimize overhead and maximize throughput and scalability. Netty provides buffers, codecs, pipelines and handlers that allow building applications as a stack of processing layers. Example applications include a discard server and an HTTP file server that demonstrate Netty's core features and event-driven architecture.
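Netty itself is Java, but the discard server it ships as an introductory example is easy to sketch in any event-driven framework. A minimal analogue using Python's asyncio, shown here only to illustrate the event-driven idea (this is not Netty's API):

```python
import asyncio

class DiscardProtocol(asyncio.Protocol):
    """Accept connections and silently drop all received bytes (RFC 863)."""
    def data_received(self, data):
        pass  # discard everything

async def main():
    loop = asyncio.get_running_loop()
    # Port 0 lets the OS pick a free port.
    server = await loop.create_server(DiscardProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # Exercise the server once from a client.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"this will be discarded")
    await writer.drain()
    writer.close()
    await writer.wait_closed()

    server.close()
    await server.wait_closed()
    return port

print(asyncio.run(main()) > 0)  # True
```

As in Netty, the application never blocks on a socket; the event loop invokes the protocol's callbacks as I/O events arrive.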
This document provides an overview of Apache Pig, including:
- Pig provides a higher level of abstraction for data users to access the power of Hadoop without writing Java code.
- The topics to be covered include an introduction to Pig, using Grunt shell, advanced operators, macros, embedding Pig Latin in Python, JSON parsing, XML parsing, UDFs, streaming, custom load/store functions, and performance tips.
- Pig is a platform for executing data flows in parallel on Hadoop using the Pig Latin language, which includes operators for traditional data operations and the ability for users to develop their own functions.
A Network Architecture for the Web of Things (benaam)
The document proposes a network architecture for integrating physical devices and objects into the web. It involves using gateways to ensure device URLs are available, reachable, stable, and discoverable. Gateways queue requests when devices are asleep, cache responses, and update locations when devices move. The architecture addresses issues like energy efficiency, network constraints, mobility, and discovery. A preliminary version was implemented on Sun SPOT devices to demonstrate integrated physical and web resources.
Apache Pig performance optimizations talk at ApacheCon 2010 (Thejas Nair)
Pig provides a high-level language called Pig Latin for analyzing large datasets. It optimizes Pig Latin scripts by restructuring the logical query plan through techniques like predicate pushdown and operator rewriting, and by generating efficient physical execution plans that leverage features like combiners, different join algorithms, and memory management. Future work aims to improve memory usage and allow joins and groups within a single MapReduce job when keys are the same.
The HARE execution model looks at scaling operating systems and runtimes to large supercomputers with thousands to millions of cores. It uses an alternative approach based on the Plan 9 distributed operating system for hardware support, systems infrastructure, and distributed services. Key components include BRASIL, a stripped down Inferno operating system, nompirun for legacy-friendly job launching, and a workload optimized distribution model using multipipes and filters running in a PUSH dataflow model. Central services establish a hierarchical namespace across cluster resources.
Stephen Nguyen, a Developer Evangelist for ClusterHQ, reviews how volumes work and outlines the benefits of letting Flocker orchestrate your volumes. (video coming soon)
QGIS plugin for parallel processing in terrain analysis (Ross McDonald)
Art Lembo's presentation on embarrassingly parallel processing with QGIS and pyCUDA for terrain analysis, given at the 6th Scottish QGIS UK user group meeting.
The document provides a summary of various physical design problems Lee Johnson has solved using Tcl scripting in different EDA tool environments over several projects from the late 1990s to recent years. It lists solutions by project, including mesh clock routing flows in Cadence Innovus from 2015-2016, floorplanning and routing scripts for IBM chips from 1997-2015, and scripts addressing problems like pin placement, bus routing, and macro placement for other ASIC projects during the same period. It also provides examples of general utilities developed.
The document discusses Linux networking commands and tools. It provides examples of using ip commands to view and configure network interfaces, routes, neighbors, and rules. It also shows tcpdump for packet capture and nmap for port scanning. Firewalls are configured using iptables to allow traffic from a specific source to a web server port.
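The commands described above look roughly like the following (interface names, addresses and ports are placeholders; the iptables rules mirror the "allow one source to a web port" example):

```shell
# Inspect interfaces, routes and neighbours
ip addr show
ip route show
ip neigh show

# Capture packets on eth0 and scan a host's ports
tcpdump -i eth0 -n port 80
nmap -p 1-1024 192.0.2.10

# Allow web traffic to port 80 from one source, drop the rest
iptables -A INPUT -p tcp -s 192.0.2.20 --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j DROP
```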
The document discusses using the OpenDaylight BGP speaker to handle different types of routes including:
1. Link-state routes from IS-IS or OSPF that are advertised via BGP-LS and used to create a link-state topology.
2. IPv4 and IPv6 routes that are learned and advertised across domains.
3. Flowspec routes that function similar to OpenFlow rules but can leverage the BGP route reflector infrastructure with actions encoded as BGP communities.
The document outlines how to configure the BGP speaker through RESTCONF to handle and advertise these different routes, and provides demos of using it for BGP-LS/PCEP and IPv4 route advertisement.
Gofer is a scalable stateless proxy architecture for DBI that is transport independent, highly configurable, efficient, well tested, scalable, and simple. It consists of a simple request/response protocol, a DBI proxy driver called DBD::Gofer, a request executor module, pluggable transport modules like HTTP, SSH, and Gearman, and an extensible client configuration mechanism. It aims to minimize round trips and supports connection pooling to improve performance and scalability.
Pig is a data flow language that sits on top of Hadoop and allows users to quickly process large volumes of data across many servers simultaneously. It supports relational features like joins, groups, and aggregates, making it well-suited for extract, transform, load (ETL) tasks. Common ETL use cases for Pig include time-sensitive data loads from various sources into databases, and processing multiple data sources to gain insights into customer behavior. While Pig can handle ETL tasks, it is also capable of sampling large datasets for analysis and providing analytical insights beyond basic ETL functions.
[OpenInfra Days Korea 2018] Day 2 - E6 - OpenInfra monitoring with Prometheus (OpenStack Korea Community)
This document discusses using Prometheus for open infrastructure and cloud monitoring. It introduces Prometheus as a time series database and monitoring tool. Key features covered include metrics collection, service discovery, graphing, and alerting. The architecture of Prometheus is explained, including scraping metrics directly or via exporters. A demo of Prometheus and Grafana is proposed to monitor Kubernetes clusters and visualize CPU usage. Alerting configuration and routes in Prometheus and Alertmanager are also summarized.
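The pull-based architecture boils down to a small piece of configuration; a minimal prometheus.yml that scrapes a node exporter might look like this (the target address is a placeholder):

```yaml
global:
  scrape_interval: 15s     # how often Prometheus pulls metrics

scrape_configs:
  - job_name: "node"       # scrape a node_exporter endpoint directly
    static_configs:
      - targets: ["localhost:9100"]
```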
Hadoop installation and Running KMeans Clustering with MapReduce Program on H... (Titus Damaiyanti)
1. The document discusses installing Hadoop in single node cluster mode on Ubuntu, including installing Java, configuring SSH, extracting and configuring Hadoop files. Key configuration files like core-site.xml and hdfs-site.xml are edited.
2. Formatting the HDFS namenode clears all data. Hadoop is started using start-all.sh and the jps command checks if daemons are running.
3. The document then moves to discussing running a KMeans clustering MapReduce program on the installed Hadoop framework.
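One KMeans iteration maps naturally onto MapReduce: the map step assigns each point to its nearest centroid, and the reduce step averages the points per centroid. A small pure-Python sketch of that decomposition (an illustration of the idea, not the program from the document):

```python
from collections import defaultdict

def nearest(point, centroids):
    """Map step: index of the centroid closest to this point."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def kmeans_iteration(points, centroids):
    # "Map": emit (centroid_index, point) pairs, grouped by key.
    groups = defaultdict(list)
    for p in points:
        groups[nearest(p, centroids)].append(p)
    # "Reduce": new centroid = mean of the points assigned to it.
    return [
        tuple(sum(dim) / len(pts) for dim in zip(*pts)) if (pts := groups[i]) else c
        for i, c in enumerate(centroids)
    ]

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(kmeans_iteration(points, [(0.0, 0.0), (10.0, 10.0)]))
# [(0.0, 0.5), (10.0, 10.5)]
```

On Hadoop, the driver re-runs this map/reduce pair once per iteration, broadcasting the updated centroids before each pass.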
Introduction to Cassandra and CQL for Java developers (Julien Anguenot)
This talk provides a high-level overview of Cassandra, the Cassandra Query Language (CQL) and, more specifically, the DataStax CQL Java driver. It aims to introduce Java developers to tools, techniques and best practices for building Java applications that leverage the Cassandra database using CQL3.
This document contains a homework assignment for a course on execution environments for distributed computing. It discusses Apache Pig, a platform for analyzing large datasets. The document outlines Pig's data model, programming model, and implementation. Key points include Pig Latin's declarative syntax, support for user-defined functions, and ability to automatically parallelize jobs by compiling Pig Latin into MapReduce programs. The conclusions discuss advantages like flexibility and leveraging Hadoop properties, as well as disadvantages like potential performance overhead. Usage scenarios involve temporal and session analysis on datasets like search logs.
This document provides an overview of Apache Pig, including its data model, relational commands like JOIN and GROUP, and how it is implemented on Hadoop. Pig is a platform for analyzing large datasets that uses a declarative language called Pig Latin. It features a rich nested data model and relational commands that are compiled into MapReduce jobs for parallel processing. The implementation utilizes lazy execution, compiling logical plans on-the-fly into physical MapReduce plans only when needed. While Pig provides an easy parallel programming model, it can incur overhead from compiling Pig Latin and user-defined functions.
Declarative Programming and a form of SDN (Miya Kohno)
The document discusses declarative programming as it relates to network programmability. It provides examples of declarative versus imperative code and explains key concepts of declarative programming like lack of side effects, referential transparency, and idempotence. It also discusses how declarative programming could be beneficial for networking given its robustness in complex distributed environments but may lack universal computational power. OpenDaylight and ETSI NFV architectures are presented as examples combining declarative and imperative approaches.
The document discusses declarative programming as it relates to network programmability. It provides examples of declarative versus imperative code and explains key concepts of declarative programming like lack of side effects, referential transparency, and idempotence. It also discusses how declarative programming can provide benefits like robustness, scalability, and reusability for network systems, which often operate in uncertain distributed environments. Finally, it outlines some declarative programming approaches being used for network control, orchestration, and automation.
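The two concepts named above are easy to make concrete (an illustrative sketch, not taken from the slides): a referentially transparent call can be replaced by its value, and an idempotent operation can be applied repeatedly with the same result, which is the property that makes declarative desired-state configuration safe to re-apply.

```python
# Referential transparency: the call can be replaced by its value.
def add(a, b):
    return a + b           # no side effects

assert add(2, 3) == 5      # substituting 5 wherever add(2, 3) appears is safe

# Idempotence: applying the operation once or many times gives the same state.
def ensure_route(routes, dest, via):
    """Declare 'dest is reachable via via'; re-applying changes nothing."""
    routes[dest] = via
    return routes

state = {}
ensure_route(state, "10.0.0.0/8", "192.0.2.1")
ensure_route(state, "10.0.0.0/8", "192.0.2.1")  # second apply is a no-op
print(state)  # {'10.0.0.0/8': '192.0.2.1'}
```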
One tool, two fabrics: Ansible and Nexus 9000 (Joel W. King)
Ansible can be used to automate configuration of Cisco Nexus 9000 series switches running either NX-OS or Application Centric Infrastructure (ACI). It allows using YAML files, Jinja templates, and Python modules to provision and manage network infrastructure without relying on CLI commands. The presentation demonstrated using Ansible roles to configure NTP servers and backup settings for an ACI fabric by specifying variables in a CSV file and generating XML configuration files from templates.
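The pattern described, plain-data variables plus templated configuration, condenses to a small playbook. A hedged sketch (the host group, template and variable names are hypothetical, and the actual presentation used ACI-specific modules):

```yaml
# site.yml: render an NTP config from a Jinja2 template per switch
- hosts: nexus_switches
  gather_facts: no
  vars:
    ntp_servers:
      - 192.0.2.1
      - 192.0.2.2
  tasks:
    - name: Generate NTP configuration from template
      template:
        src: ntp.conf.j2
        dest: "./configs/{{ inventory_hostname }}_ntp.xml"
```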
NUSE is a library implementation of a network stack in userspace that allows new protocols and implementations to be added more quickly without modifying the kernel. It works by hijacking system calls related to networking at the library level, running the network stack code in a separate execution context using lightweight virtualization, and connecting to the network interface using options like raw sockets, DPDK, or netmap. This approach avoids the slow evolution of making kernel changes and allows network stacks and applications to be updated and deployed more flexibly on a per-application basis.
This document discusses several distributed computing systems:
1) DNS is a distributed system that maps domain names to IP addresses using a hierarchical naming structure and caching DNS servers for efficiency.
2) BOINC is a volunteer computing platform that uses over a million computers worldwide for distributed applications like disease research. It provides incentives and verifies results to prevent cheating.
3) PlanetLab is a research network with over 700 servers globally that allows testing new distributed systems at large scales under realistic conditions. It isolates projects using virtualization and trust relationships.
Data-Intensive Computing for Competent Genetic Algorithms: A Pilot Study us... (Xavier Llorà)
Data-intensive computing has positioned itself as a valuable programming paradigm to efficiently approach problems requiring processing very large volumes of data. This paper presents a pilot study about how to apply the data-intensive computing paradigm to evolutionary computation algorithms. Two representative cases (selectorecombinative genetic algorithms and estimation of distribution algorithms) are presented, analyzed, and discussed. This study shows that equivalent data-intensive computing evolutionary computation algorithms can be easily developed, providing robust and scalable algorithms for the multicore-computing era. Experimental results show how such algorithms scale with the number of available cores without further modification.
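The core observation, that fitness evaluation in a selectorecombinative GA is an embarrassingly parallel map, can be sketched in a few lines (a toy OneMax example, not the paper's implementation). Swapping the built-in map for a process pool's map is what lets the algorithm scale with cores:

```python
import random

def fitness(individual):
    """OneMax: count of 1-bits; each evaluation is independent (a pure map)."""
    return sum(individual)

def tournament(pop, scores, rng):
    i, j = rng.randrange(len(pop)), rng.randrange(len(pop))
    return pop[i] if scores[i] >= scores[j] else pop[j]

def one_generation(pop, rng, evaluate=map):
    # The data-intensive part: evaluate is map here, pool.map on many cores.
    scores = list(evaluate(fitness, pop))
    nxt = []
    for _ in range(len(pop)):  # tournament selection + one-point crossover
        a, b = tournament(pop, scores, rng), tournament(pop, scores, rng)
        cut = rng.randrange(1, len(a))
        nxt.append(a[:cut] + b[cut:])
    return nxt

rng = random.Random(1)
pop = [[rng.randint(0, 1) for _ in range(20)] for _ in range(30)]
for _ in range(10):
    pop = one_generation(pop, rng)
print(len(pop), len(pop[0]))  # 30 20
```

Because fitness has no shared state, `evaluate` could equally be `multiprocessing.Pool().map` or a Hadoop map phase without changing the algorithm.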
HPC and cloud distributed computing, as a journeyPeter Clapham
Introducing an internal cloud brings new paradigms, tools and infrastructure management. When placed alongside traditional HPC the new opportunities are significant But getting to the new world with micro-services, autoscaling and autodialing is a journey that cannot be achieved in a single step.
PuppetDB: Sneaking Clojure into Operationsgrim_radical
The document provides an overview of PuppetDB, which is a system for storing and querying data about infrastructure as code and system configurations. Some key points:
- PuppetDB stores immutable data about systems and allows querying of this data to enable higher-level infrastructure operations.
- It uses techniques like command query responsibility separation (CQRS) to separate write and read pipelines for better performance and reliability.
- The data is stored in a relational database for efficient querying, and queries are expressed in an abstract syntax tree (AST)-based language.
- The system is designed for speed, reliability, and ease of deployment in operations. It leverages techniques from Clojure and the JVM.
This document provides an outline and overview of a tutorial on executing applications on the Grid with COMPSs. The tutorial will cover the IS-ENES project, grid technology requirements for climate modeling, the COMPSs programming model, programming with COMPSs, COMPSs examples, and a hands-on session. It will also cover collecting requirements and concluding remarks. The programming model section will discuss StarSs, COMPSs objectives, the COMPSs programming steps, and the COMPSs IDE. Configuration of COMPSs projects and resources on grids and clouds will also be demonstrated.
The document discusses software defined networking (SDN) from the perspective of network engineers. It describes how SDN aims to provide more choice and control over networks similar to how virtualization has transformed computing. SDN separates the control plane, which directs traffic, from the forwarding plane, which sends packets. This allows network administrators to program how traffic flows using a centralized controller and OpenFlow protocol. The document also covers related topics like network functions virtualization, overlay networks, and white-box networking using Linux.
Evolution of unix environments and the road to faster deploymentsRakuten Group, Inc.
1. In the 1960s, Ken Thompson created the video game "Space Travel" while working on the Multics Operating System at Bell Labs. When Bell Labs withdrew from the project, Thompson rewrote Space Travel on an old PDP-7 machine. The tools created for the game later became the Unix operating system.
2. Virtualization successfully decoupled hardware from services, allowing easy provisioning of virtual machines (VMs) from standard templates. This simplified administration and reduced provisioning time from months to days or immediately.
3. The rise of public cloud and internal virtualization drove the creation of DevOps approaches to fully automate the software development lifecycle from code to deployment. This automation reduced friction
Kubernetes for java developers - Tutorial at Oracle Code One 2018Anthony Dahanne
You’re a Java developer? Already familiar with Docker? Want to know more about Kubernetes and its ecosystem for developers? During this session, you’ll get familiar with core Kubernetes concepts (pods, deployments, services, volumes, and so on) before seeing the most-popular and most-productive Kubernetes tools in action, with a special focus on Java development. By the end of the session, you’ll have a better understanding of how you can leverage Kubernetes to speed up your Java deployments on-premises or to any cloud.
Osi week10(1) [autosaved] by Gulshan K Maheshwari(QAU)GulshanKumar368
The document discusses the OSI model, which outlines the steps needed to send data from one computer to another. It describes the seven layers of the OSI model and what each layer is responsible for.
The physical layer deals with physical components like cabling and data encoding. The data link layer places and retrieves data from the physical layer and provides error detection. The network layer provides addressing and routing between networks. The transport layer provides reliable data delivery. The session layer allows applications to maintain ongoing sessions. The presentation layer prepares data to be processed. The application layer gives access to network resources. Data moves down the layers at the source and up at the destination, with each layer adding its own header.
Sanger, upcoming Openstack for Bio-informaticiansPeter Clapham
Delivery of a new Bio-informatics infrastructure at the Wellcome Trust Sanger Center. We include how to programatically create, manage and provide providence for images used both at Sanger and elsewhere using open source tools and continuous integration.
The document provides an overview of MapReduce and its evolution from Google's internal implementation to the open source Hadoop framework, describing how MapReduce works, its key components like HDFS and YARN, and some examples of its usage for common tasks like word counting and analytics. It also shares some details about Nokia's implementation and use of Hadoop for processing large log and reporting data.
This document provides an overview and introduction to Nuclio, an open source serverless platform. Some key points:
- Nuclio is described as a new real-time serverless platform that is comprehensive, open, super fast, and can run anywhere. It provides high performance for serverless functions.
- It supports a variety of programming languages and event sources. Functions can be easily deployed and auto-scaled.
- Nuclio is presented as a solution for complex serverless applications through its data integration capabilities and high performance. Examples of real-time applications using Nuclio are described like fleet management, surveillance, and more.
Similar to Holistic Aggregate Resource Environment (20)
This document discusses scaling computing from one core to one trillion cores using ARM architecture. It outlines ARM's research areas for developing serverless hardware and software to efficiently handle large volumes of data from internet-of-things sensors. The goal is to transform infrastructure through more intelligent, flexible cloud architectures using workload-optimized data centers from the edge to the core. Key research focuses on scheduling events efficiently and programming smart accelerators to process and pipeline data from little data to big data.
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Eric Van Hensbergen
The document discusses ARM's approach to future high performance computing (HPC) node architectures. It advocates for balance, flexibility, and partnership in developing exascale systems. ARM believes that a heterogeneous approach using different core types optimized for various workloads, flexible memory technologies, and partnerships to develop the software ecosystem will help overcome challenges in achieving exascale performance while improving power efficiency.
This document discusses ARM and its potential in high performance computing (HPC). It notes that embedded HPC is not new and shows the historical increase in processing power of chips over time. It describes ARM's business model of focusing on designing and licensing intellectual property building blocks that are used to build system-on-chips. ARM fosters an ecosystem of suppliers to deliver cost-effective solutions using these standardized pieces. The document discusses ARM's goal of building a prototype ARM-based supercomputer using commercially available embedded technology to demonstrate competitive performance per watt for HPC applications.
Simulation Directed Co-Design from Smartphones to SupercomputersEric Van Hensbergen
SystemExplorer is a system simulation framework based upon the open-source gem5 simulation infrastructure. It includes a rich collection of hardware components such as ARM cores, interconnect, memories and memory controllers, IO devices - ethernet, PCIe, and other peripherals. In addition it provides support for run fully featured operating systems such as Linux and Android combined with pre-packaged filesystem images that contain real workloads and benchmarks for Smartphone, Server and High Performance Computing. In this talk I'll give an overview of ARM R&D's use of the SystemExplorer tool for workload directed architectural co-design. I will focus on how we are using it in combination with the Department of Energy's co-design center proxy applications to help evaluate and enable the ARM architecture to address the power-efficiency, performance, and resilience requirements of Exascale computing.
(Presented during FastPass 2013 Workshop in Austin, TX)
This document summarizes the BRASIL distributed computing framework. It discusses the motivation for BRASIL to support large-scale data-intensive computing problems. It then provides overviews of the key techniques used in BRASIL, including its PUSH dataflow model, file system interfaces, command line tools, and microbenchmarks evaluating performance.
Dan Schatzberg, Jonathan Appavoo, Orran Krieger, and Eric Van Hensbergen. Scalable elastic systems architecture. In Proceedings of the ASPLOS Runtime Environment/Systems, Layering, and Virtualized
Environments (RESoLVE) Workshop. ASPLOS, March 2011.
This document discusses multi-pipe filesystems (mpipefs), which allow multiple processes to communicate through pipes in more flexible ways than traditional pipes. Mpipefs supports long packets, header blocks, enumerated and collective pipes like broadcast and reduce. It also allows splicing pipes together. Examples show mounting mpipefs and using pipes to pass data and arguments between processes.
VirtFS is a paravirtual file system that allows passing file systems between the KVM hypervisor and guest virtual machines. It uses the Plan 9 protocol for communication and has a server implemented in QEMU and a client in the guest kernel. VirtFS aims to provide better performance than virtual disks or network-based file systems by avoiding layers of indirection and enabling features like sharing files between guests.
This document summarizes Push, a dataflow shell that allows users to define dataflow pipelines using shell-like syntax. The shell treats everything as a pipe and aims to orchestrate dataflow execution across multiple machines. It supports features like record handling, output/input record filtering, and configurable parallelism. Research challenges include optimizing exascale pipelines, cloud integration, and work stealing. The goal is an interactive system for defining and executing data-parallel workflows across platforms.
This document describes the XCPU system, which aims to address limitations in existing cluster-based schedulers. XCPU establishes a hierarchical namespace of cluster services and mounts remote servers automatically based on references. It exports local services for remote use within the network. XCPU allows process creation and execution of groups of processes on remote systems via the file system. It provides a control file syntax for reserving resources and running tasks across multiple servers.
IBM Research provides a document on analyzing 9P traces and walking through 9P protocol code. The document contains 9P traces for common operations like mount, open/write/close on Plan 9 and Linux. It also provides a high-level overview of the code organization, core network interfaces, important data structures, and pointers for reviewing the 9P protocol code in Linux.
IBM Research presented an overview of the 9P network file system protocol. 9P was developed for the Plan 9 distributed operating system and allows files and resources to be accessed over a network similarly to how they are accessed locally. It uses a simple request-response model and supports operations like opening, reading, writing and stat'ing files. The 9P protocol has been implemented for several platforms and extended to support UNIX semantics.
The document describes PUSH, a distributed shell for Unix pipelines that runs across computer clusters. PUSH allows users to compose and run multi-stage data analysis pipelines in a distributed manner similar to traditional Unix pipelines. It uses a fan-out/fan-in model to distribute data across nodes. The authors provide background on Unix pipelines and the PUSH concept. They discuss the prototype implementation and status as well as future work.
The document describes Libra, a library operating system for the Java Virtual Machine (JVM). Libra aims to provide customized OS support for applications through a library OS approach instead of using a general purpose OS. It leverages virtualization to run JVMs within a user partition without a full OS. The document outlines the J9/Libra architecture, describes optimizations for file caching and socket streaming for the Nutch/Lucene query application, and presents initial performance results showing improvements over running on Linux. It concludes by discussing next steps and related work.
The PROSE approach allows running applications in stand-alone partitions with an easy-to-use execution environment. It enables the creation of specialized kernels as easily as developing an application library. Resource sharing between library-OS partitions and traditional partitions keeps library-OS kernels simple and reliable. Extensions allow bridging resource sharing and management across an entire cluster with a unified communication protocol.
The document discusses PROSE (Partitioned Reliable Operating System Environment), an approach that runs applications in specialized kernel partitions for finer control over system resources and improved reliability. It aims to simplify development of specialized kernels and enable resource sharing across partitions. The approach is evaluated using IBM's research hypervisor rHype, which shows PROSE can reduce noise and provide more deterministic performance than Linux. Future work focuses on running larger commercial workloads and further performance/noise experiments.
The document describes a system called Libra, formerly known as PROSE, which aims to run applications in isolated partitions for improved control, reliability and performance. It allows creating specialized kernels as easily as applications. Resources are shared between partitions using the 9P2000 protocol over a unified file namespace. This allows finer-grained control over system services while maintaining simplicity and reliability.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables, that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. In a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectorsDianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for
seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Northern Engraving | Nameplate Manufacturing Process - 2024Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
6. Ideas and approaches to help build your organization's AI strategy.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframePrecisely
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Crafting Excellence: A Comprehensive Guide to iOS Mobile App Development Serv...
Holistic Aggregate Resource Environment
1. Holistic Aggregate
Resource Environment
Eric Van Hensbergen (IBM Research)
Ron Minnich (Sandia National Labs)
Jim McKie (Bell Labs)
Charles Forsyth (Vita Nuova)
David Eckhardt (CMU)
3. Research Topics
• Pre-requisite: reliability and application-driven design is pervasive in all explored areas
• Offload/Acceleration Deployment Model
• The supercomputer needs to become an extension of the scientist's desktop, as opposed to a batch-driven, non-standard run-time environment.
• Leverage aggregation as a first-class systems construct to help manage complexity and provide a foundation for scalability, reliability, and efficiency.
• Distribute system services throughout the machine (not just on the I/O node)
• Interconnect Abstractions & Utilization
• Leverage HPC interconnects in system services (file system, etc.)
• Sockets & TCP/IP don't map well to HPC interconnects (torus and collective) and are inefficient when the hardware provides reliability
4. Right Weight Kernel
• General purpose multi-threaded, multi-user environment
• Pleasantly portable
• Relatively lightweight (relative to Linux)
• Core Principles
• All resources are synthetic file hierarchies
• Local & remote resources accessed via a simple API
• Each thread can dynamically organize local and remote resources via a dynamic private namespace
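The core principles above can be illustrated with a toy model (Python for illustration, not Plan 9 code; the class names and behavior are invented): resources are trees of files whose contents are synthesized on read, and each thread assembles its own view of them through a private mount table.

```python
class SyntheticFile:
    """A file whose content is generated on read, like a /proc entry."""
    def __init__(self, render):
        self.render = render

    def read(self):
        return self.render()

class Namespace:
    """Per-thread mapping from mount points to (possibly remote) trees."""
    def __init__(self):
        self.mounts = {}

    def bind(self, path, tree):
        # Attach a resource tree at a mount point in this private view.
        self.mounts[path] = tree

    def open(self, path):
        # Longest-prefix match against mount points, then walk the tree.
        for mnt in sorted(self.mounts, key=len, reverse=True):
            if path.startswith(mnt):
                return self.mounts[mnt][path[len(mnt):].lstrip("/")]
        raise FileNotFoundError(path)

ns = Namespace()
ns.bind("/net", {"status": SyntheticFile(lambda: "link up\n")})
status = ns.open("/net/status").read()  # the remote resource reads like a local file
```

Because each thread holds its own Namespace, two threads can bind different servers at the same path, which is the mechanism the dynamic private namespace bullet refers to.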
5. Aggregation
• extend the BG/P aggregation model beyond the I/O and CPU node barrier
• allow grouping of nodes into collaborating aggregates with distributed system services and dedicated service nodes
• allow specialized kernels for file service, monitoring, checkpointing, and network routing
• parameterized redundancy, reliability, and scaling
• allows dynamic (re-)organization of the programming model to match the (changing) workload
(Figure: nodes grouped under local, proxy, aggregate, and remote services.)
7. Desktop Extension
• Users want supercomputers to be an extension of their desktop
• The current parallel model is the traditional batch model
• Workloads must use specialized compilers and be scheduled from a special front-end node. Results are collected into a separate file system
• Monitoring and job control through a web interface or the MMCS command line
• A very difficult development environment and lack of interactivity limit the productivity of the execution environment
• Proposed Research
• Leverage library-OS commercial scale-out work to allow tighter coupling between the desktop environment and supercomputer resources
• Construct a runtime environment which includes some reasonable subset of support for typical Linux run-time requirements (glibc, python, etc.)
8. Extension Example
(Figure: an app on a Mac running OS X connects via ssh over the Internet to a pSeries front-end running brasil on Linux; from there, 10GB Ethernet reaches Plan 9 I/O nodes, and the collective and torus networks reach Plan 9 CPU nodes running brasil and the application.)
9. Native Interconnects
• Blue Gene specialized networks are used primarily by the user-space run-time
• Hardware is directly accessed by the user-space runtime environment and is not shared, leading to poor utilization
• Exclusive use of the tree network for I/O limits bandwidth and reliability
• Proposed Solution
• Lightweight system software interfaces to the interconnects so that they can be leveraged for system management, monitoring, and resource sharing as well as user applications
10. Protocol Exploration
• The Blue Gene networks are unusual (eg, 3D torus carrying 240-byte payloads)
• IP Works, but isn’t well matched to the underlying capabilities
• We want an efficient transport protocol to carry 9P messages & other data
streams
• Related Work: IBM’s ‘one-sided’ messaging operations [Blocksome et al]
• It supports both MPI and non-MPI applications such as Global Arrays
• Inspired by the IBM messaging protocol, we think we might do better than just IP
• Years ago there was much work on lightweight protocols for high-speed
networks
• We are using ideas from that earlier research to implement an efficient protocol
to carry 9P conversations
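As a concrete illustration of the packetization problem, here is a minimal sketch, not the project's actual protocol, of fragmenting a 9P message into 240-byte torus payloads and reassembling it on the far side. The header layout and function names are invented for this note:

```python
import struct

TORUS_PAYLOAD = 240          # Blue Gene torus packets carry 240-byte payloads
HDR = struct.Struct("!HHH")  # (message id, fragment index, fragment count); invented layout
CHUNK = TORUS_PAYLOAD - HDR.size

def fragment(msg_id, data):
    """Split one 9P message into torus-sized packets."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)] or [b""]
    return [HDR.pack(msg_id, i, len(chunks)) + c for i, c in enumerate(chunks)]

def reassemble(packets):
    """Rebuild the message from (possibly reordered) packets."""
    frags = {}
    for p in packets:
        _, idx, total = HDR.unpack(p[:HDR.size])
        frags[idx] = p[HDR.size:]
    assert len(frags) == total, "missing fragments"
    return b"".join(frags[i] for i in range(total))
```

A real transport would add retransmission and flow control; the point is that a message-oriented 9P stream maps naturally onto fixed-size torus packets without a full IP stack underneath.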
11. Project Roadmap
[Gantt chart, years 0–3: Hardware Support, then Systems Infrastructure, then Evaluation, Scaling, & Tuning]
15. PUSH
[Figure 1: The structure of the PUSH shell: shells and commands connected by pipes, with multiplexor and demultiplexor stages fanning records out to parallel pipelines and coalescing them back]

push -c ’{
  ORS=./blm.dis
  du -an files |< xargs os chasen | awk ’{print $1}’ | sort | uniq -c >| sort -rn
}’
We have added two additional pipeline operators: a multiplexing fan-out (|<[n]) and a coalescing fan-in (>|). This combination allows PUSH to distribute I/O to and from multiple simultaneous threads of control. The fan-out argument n specifies the desired degree of parallel threading. If no argument is specified, the default is to spawn a new thread per record (up to the limit of available cores). This can also be overridden by command-line options or environment variables. The pipeline operators provide implicit grouping semantics, allowing natural nesting and composability. While their complementary nature usually leads to symmetric mappings (where the number of fan-outs equals the number of fan-ins), nothing in our implementation enforces this.
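The fan-out/fan-in semantics can be sketched with ordinary threads. `fan_out_in` and its record/worker model are illustrative names for this note, not PUSH internals (and unlike this sketch, PUSH fan-in need not preserve record order):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def fan_out_in(records, work, n=None):
    """Apply `work` to each record on up to n threads (the |<[n] fan-out),
    then coalesce the results back into a single stream (the >| fan-in)."""
    if n is None:
        # default: one thread per record, capped at the available cores
        n = max(1, min(len(records), os.cpu_count() or 1))
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(work, records))  # this sketch keeps input order

# e.g. tokenize three records on two parallel threads
counts = fan_out_in(["a b", "a", "b c c"], lambda s: len(s.split()), n=2)
```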
17. Strid3
[Plot: time in seconds for 1024 iterations of Y = aX + Y, as a function of the “stride”, i.e. the distance between scalars]
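The kernel being timed is the classic strided AXPY update. A minimal sketch (pure Python, so it shows the access pattern rather than realistic cache behaviour; names are ours):

```python
import time

def axpy_strided(a, x, y, stride):
    """y[i] = a*x[i] + y[i] over every stride-th element."""
    for i in range(0, len(x), stride):
        y[i] = a * x[i] + y[i]

n = 1 << 16
x, y = [1.0] * n, [2.0] * n
t0 = time.perf_counter()
for _ in range(16):                 # the slide timed 1024 iterations
    axpy_strided(3.0, x, y, stride=8)
elapsed = time.perf_counter() - t0  # plotted against the stride in the slide
```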
18. Application Support
• Native
• Inferno Virtual Machine
• CNK Binary Support
• ELF converter
• Extended proc interface to mark processes as “cnk procs”
• Transition occurs once the process execs, and not before
• Shim in syscall trap code to adapt argument-passing conventions
• Linux Binary Support
• Basic Linux binary support
• Functional enough to run basic programs (Python, etc.)
19. Publications
• Unified Execution Model for Cloud Computing; Eric Van Hensbergen, Noah Evans, Phillip Stanley-
Marbell. Submitted to LADIS 2009; October 2009.
• PUSH, a DISC Shell; Eric Van Hensbergen, Noah Evans. To Appear in the Proceedings of the Principles of
Distributed Computing Conference; August 2009.
• Measuring Kernel Throughput on BG/P with the Plan 9 Research Operating System; Ron Minnich, John
Floren, Aki Nyrhinen. Submitted to SC 09; November 2009.
• XCPU2: Distributed Seamless Desktop Extension; Eric Van Hensbergen, Latchesar Ionkov. Submitted to
IEEE Clusters 2009; October 2009.
• Service Oriented File Systems; Eric Van Hensbergen, Noah Evans, Phillip Stanley-Marbell. IBM Research Report (RC24788); June 2009.
• Experiences Porting the Plan 9 Research Operating System to the IBM Blue Gene Supercomputers; Ron
Minnich, Jim McKie. To appear in the Proceedings of the International Conference on Supercomputing
(ISC); June 2009.
• System Support for Many Task Computing; Eric Van Hensbergen and Ron Minnich. In the Proceedings of
the Workshop on Many Task Computing on Grids and Supercomputers; November 2008.
• Holistic Aggregate Resource Environment; Charles Forsyth, Jim McKie, Ron Minnich and Eric Van
Hensbergen. In the ACM Operating Systems Review; January 2008.
• Night of the Lepus: A Plan 9 Perspective on Blue Gene's Interconnects; Charles Forsyth, Jim McKie, Ron Minnich and Eric Van Hensbergen. In the proceedings of the second annual international workshop on Plan 9; December 2007.
• Petascale Plan 9. USENIX 2007
20. Next Steps
• Infrastructure Scale Out
• File Services
• Command Execution
• Alternate Internode Communication Models
• Fail-in-place software RAS models
• Applications (Linux binaries and native support)
• Large Scale LINPACK Run
• Explore Mantevo Application Suite
• (http://software.sandia.gov/mantevo)
• CMU Working on Native Quake port
21. Acknowledgments
• Computational Resources Provided by
DOE INCITE Program. Thanks to the
patient folks at ANL who have supported
us bringing up Plan 9 on their development
BG/P
• Thanks to IBM Research Blue Gene team
and the Kittyhawk Team for guidance and
support.
24. IBM Research, Sandia National Labs, Bell Labs, and CMU
24 Systems Support for Many Task Computing 11/17/2008 (c) 2008 IBM Corporation
25. Plan 9 Characteristics
Kernel Breakdown - Lines of Code
Architecture Specific Code
BG/P: ~14,000 lines of code
Portable Code
Port: ~25,000 lines of code
TCP/IP Stack: ~14,000 lines of code
Binary Sizes
415k Text + 140k Data + 107k BSS
26. Why not Linux?
Not a distributed system
Core systems inflexible
VM based on x86 MMU
Networking tightly tied to sockets & TCP/IP, with a long call path
Typical installations extremely overweight and noisy
Benefits of modularity and open-source advantages overcome by complexity, dependencies, and rapid rate
of change
Community has become conservative
Support for alternative interfaces waning
Support for large systems that hurts small systems is not acceptable
Ultimately a customer constraint
FastOS was developed to prevent OS monoculture in HPC
Few Linux projects were even invited to submit final proposals
27. Everything Represented as File Systems
[Diagram: three columns of synthetic file trees]
Hardware Devices: Disk (/dev/hda1, /dev/hda2), Network (/dev/eth0), Console, Audio, Etc.
System Services: TCP/IP Stack (/net with /arp, /udp, /stats, and /tcp holding /clone plus per-connection directories /0, /1, ... each with /ctl, /data, /listen, /local, /remote, /status), Process Control, Debug, Etc.
Application Services: DNS (/net with /cs and /dns), GUI (/win with /clone plus per-window directories /0, /1, /2 each with /ctl, /data, /refresh), Wiki, Authentication, and Service Control
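The TCP/IP file tree follows Plan 9's clone convention: reading `clone` allocates a fresh connection directory whose `ctl` and `data` files drive it. A toy in-memory model of that convention (the `NetFS` class is invented for this sketch; the file layout mirrors the real `/net/tcp`):

```python
# Toy in-memory model of Plan 9's /net/tcp clone convention (NetFS is
# invented here; the file layout mirrors the real interface).
class NetFS:
    def __init__(self):
        self.conns = {}                       # connection number -> files

    def read(self, path):
        if path == "/net/tcp/clone":          # reading clone allocates a conn
            n = str(len(self.conns))
            self.conns[n] = {"ctl": "", "data": b""}
            return n
        n, name = path.split("/")[3:5]        # /net/tcp/<n>/<file>
        return self.conns[n][name]

    def write(self, path, data):
        n, name = path.split("/")[3:5]
        self.conns[n][name] = data

net = NetFS()
n = net.read("/net/tcp/clone")                          # first conn is "0"
net.write(f"/net/tcp/{n}/ctl", "connect 10.0.0.1!564")  # 564: 9P's registered port
net.write(f"/net/tcp/{n}/data", b"Tversion...")         # then talk on data
```

On a real Plan 9 system the same sequence is plain file I/O, which is what lets these interfaces be exported and aggregated over 9P.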
28. Plan 9 Networks
[Diagram: terminals (screens, phones, PDAs, smartphones, set-top boxes) reach the Internet over WiFi/Edge and Cable/DSL; terminals on a LAN (1 GB/s) network connect to file servers and CPU servers, which share content-addressable storage over a high-bandwidth (10 GB/s) network]
29. Aggregation as a First Class Concept
[Diagram: a local service, a proxy service, and an aggregate service, each backed by a remote service]
30. Issues of Topology
31. File Cache Example
Proxy Service
Monitors access to remote file server & local resources
Local cache mode
Collaborative cache mode
Designated cache server(s)
Integrate replication and redundancy
Explore write coherence via “territories” à la Envoy
Based on experiences with the Xget deployment model
Leverage the natural topology of the machine where possible.
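A minimal sketch of the lookup order the cache modes above imply (class and field names invented): try the local cache, then a collaborating peer's cache, then fall back to the remote file server:

```python
# Toy cache proxy: lookup order is local cache, then a collaborating
# peer's cache, then the remote server; all names are invented here.
class CacheProxy:
    def __init__(self, remote, peers=()):
        self.local = {}           # local cache mode
        self.remote = remote      # authoritative file server (a dict here)
        self.peers = list(peers)  # collaborative cache mode

    def read(self, path):
        if path in self.local:
            return self.local[path]
        for peer in self.peers:
            if path in peer.local:
                data = peer.local[path]
                break
        else:
            data = self.remote[path]   # miss everywhere: go remote
        self.local[path] = data        # populate for later reads
        return data

server = {"/data/a": b"A"}
peer = CacheProxy(server)
peer.read("/data/a")                   # warm the peer from the server
node = CacheProxy(server, peers=[peer])
hit = node.read("/data/a")             # served from the peer's cache
```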
32. Monitoring Example
Distribute monitoring throughout the system
Use for system health monitoring and load balancing
Allow for application-specific monitoring agents
Distribute filtering & control agents at key points in
topology
Allow for localized monitoring and control as well as
high-level global reporting and control
Explore both push and pull monitoring models
Based on experiences with the supermon system.
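The push and pull models can be sketched side by side (names invented; real agents would sample hardware counters and aggregate along the machine topology):

```python
import queue

class Node:
    """Toy monitored node. Pull: the collector asks each node for a
    sample. Push: each node reports into a collector queue."""
    def __init__(self, name):
        self.name, self.load = name, 0.0

    def sample(self):              # pull model: collector-driven
        return (self.name, self.load)

    def push_to(self, q):          # push model: node-driven
        q.put((self.name, self.load))

nodes = [Node("io0"), Node("cpu3")]
nodes[1].load = 0.75

pulled = [n.sample() for n in nodes]   # collector sweeps the nodes

q = queue.Queue()
for n in nodes:
    n.push_to(q)                       # nodes report asynchronously
pushed = [q.get() for _ in nodes]
```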
33. Workload Management Example
Provide file system interface to job execution and
scheduling.
Allows scheduling of new work from within the
cluster, using localized as well as global scheduling
controls.
Can allow for more organic growth of workloads as
well as top-down and bottom-up models.
Can be extended to allow direct access from end-user workstations.
Based on experiences with Xcpu mechanism.
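A toy sketch of the file-system interface idea (directory layout invented, loosely Xcpu-flavoured): a job is a directory, writing `ctl` records the command, and `status`/`stdout` expose the results as plain files:

```python
import pathlib, subprocess, sys, tempfile

# Toy file-system job interface: one directory per job, driven and
# inspected through plain files; the layout is invented for this sketch.
root = pathlib.Path(tempfile.mkdtemp())

def submit(jobid, argv):
    d = root / jobid
    d.mkdir()
    (d / "ctl").write_text(" ".join(argv))            # the request
    done = subprocess.run(argv, capture_output=True, text=True)
    (d / "stdout").write_text(done.stdout)            # the results
    (d / "status").write_text(str(done.returncode))

submit("0", [sys.executable, "-c", "print('hello')"])
```

Because the interface is just files, it can be exported over 9P and scheduled into from anywhere in the cluster, which is the point of the slide.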
34. Right Weight Kernels Project (Phase I)
Motivation
OS Effect on Applications
Metric is based on OS interference on the FWQ & FTQ benchmarks.
AIX/Linux has more capability than many apps need
LWK and CNK have less capability than apps want
Approach
Customize the kernel to the application
Ongoing Challenges
Need to balance capability with overhead
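FWQ (Fixed Work Quanta) times an identical busy loop over and over; the spread across samples approximates OS interference. A minimal sketch (parameters are ours, not the benchmark's defaults):

```python
import time

def fwq(loops=200, work=20_000):
    """Fixed Work Quanta: time the same busy loop `loops` times; the
    spread in the samples approximates OS interference (noise)."""
    samples = []
    for _ in range(loops):
        t0 = time.perf_counter_ns()
        s = 0
        for i in range(work):      # the fixed quantum of work
            s += i
        samples.append(time.perf_counter_ns() - t0)
    return samples

samples = fwq()
jitter = (max(samples) - min(samples)) / min(samples)  # relative noise
```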
35. Why Blue Gene?
Readily available large-scale cluster
Minimum allocation is 37 nodes
Easy to get 512 and 1024 node configurations
Up to 8192 nodes available upon request internally
FastOS will make a 64k-node configuration available
DOE interest – Blue Gene was a specified target
Variety of interconnects allows exploration of alternatives
Embedded-core design provides a simple architecture that is quick to port to and doesn't require heavyweight systems-software management, device drivers, or firmware