A science-gateway for workflow executions: online and non-clairvoyant self-h...
Rafael Ferreira da Silva
PhD Thesis presented on November 29th 2013 at INSA-Lyon
Abstract - Science gateways, such as the Virtual Imaging Platform (VIP), enable transparent access to distributed computing and storage resources for scientific computations. However, their large scale and the number of middleware systems involved lead to many errors and faults. In practice, science gateways are often backed by substantial support staff who monitor running experiments by performing simple yet crucial actions such as rescheduling tasks, restarting services, killing misbehaving runs or replicating data files to reliable storage facilities. Fair quality of service (QoS) can then be delivered, yet with significant human intervention. Automating such operations is challenging for two reasons. First, the problem is online by nature because no reliable user activity prediction can be assumed, and new workloads may arrive at any time. Therefore, the considered metrics, decisions and actions have to remain simple and yield results while the application is still executing. Second, it is non-clairvoyant due to the lack of information about applications and resources in production conditions. Computing resources are usually dynamically provisioned from heterogeneous clusters, clouds or desktop grids without any reliable estimate of their availability and characteristics. Models of application execution times are rarely available either, in particular on heterogeneous computing resources. In this thesis, we propose a general healing process for autonomous detection and handling of operational incidents in workflow executions. Instances are modeled as Fuzzy Finite State Machines (FuSM) where state degrees of membership are determined by an external healing process. Degrees of membership are computed from metrics assuming that incidents have outlier performance, e.g. a site or a particular invocation behaves differently than the others. Based on incident degrees, the healing process identifies incident levels using thresholds determined from the platform history. A specific set of actions is then selected from association rules among incident levels.
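The idea of computing an incident degree from outlier behaviour and then mapping it to a level can be sketched as follows. This is only an illustration: the deviation-from-median metric, the threshold values, and the names `incident_degree` and `incident_level` are assumptions, not the metrics or thresholds used in the thesis (which derives its thresholds from platform history).

```python
from statistics import median

def incident_degree(values, index):
    """Hypothetical degree in [0, 1]: relative deviation of one
    measurement from the median of all the other measurements."""
    others = [v for i, v in enumerate(values) if i != index]
    reference = median(others)
    if reference == 0:
        return 0.0 if values[index] == 0 else 1.0
    deviation = abs(values[index] - reference) / reference
    return min(1.0, deviation)  # clamp so the degree stays in [0, 1]

def incident_level(degree, thresholds=(0.3, 0.7)):
    """Map a degree to a level using illustrative thresholds
    (assumed values, standing in for ones learned from history)."""
    low, high = thresholds
    if degree < low:
        return "none"
    if degree < high:
        return "mild"
    return "severe"

# Task durations (seconds) for four invocations; the last one is an outlier.
durations = [100, 110, 95, 400]
degrees = [incident_degree(durations, i) for i in range(len(durations))]
levels = [incident_level(d) for d in degrees]
```

In a full healing process, the resulting levels would then be used to select actions; that selection step is not shown here.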
For more information visit http://www.rafaelsilva.com
Task Resource Consumption Prediction for Scientific Applications and Workflows
Rafael Ferreira da Silva
Presentation held at the Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems 2015 Seminar - Dagstuhl
Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable scientific application executions. Such algorithms often assume that accurate estimates are available, but such estimates are difficult to generate in practice. In this work, we first profile real scientific applications and workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize task requirements based on these profiles. Our method estimates task runtime, disk space, and peak memory consumption. It looks for correlations between the parameters of a dataset, and if no correlation is found, the dataset is divided into smaller subsets using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict task characteristics of scientific applications based on the collected data. For scientific workflows, we propose an online estimation process based on the MAPE-K loop, where task executions are monitored and estimates are updated as more information becomes available. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, where all task requirements are estimated prior to workflow execution.
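The difference between an offline estimate and one refined online can be illustrated with a minimal sketch. It is loosely shaped like a MAPE-K loop (monitor, analyze, plan, execute over shared knowledge), but the running mean used here is a deliberately simple stand-in, not the correlation and recursive-partitioning method described in the abstract. The class name, method names, and sample runtimes are all hypothetical.

```python
class OnlineEstimator:
    """Keeps a runtime estimate and refines it as tasks complete."""

    def __init__(self, initial_estimate):
        self.estimate = initial_estimate  # offline guess made before execution
        self.observed = []                # knowledge: runtimes seen so far

    def monitor(self, runtime):
        """Record the measured runtime of a finished task."""
        self.observed.append(runtime)

    def analyze_and_plan(self):
        """Update the estimate from observations (running mean stand-in)."""
        if self.observed:
            self.estimate = sum(self.observed) / len(self.observed)
        return self.estimate

offline_estimate = 60.0  # fixed before the workflow starts
estimator = OnlineEstimator(initial_estimate=offline_estimate)

for runtime in [100.0, 120.0, 110.0]:  # hypothetical measured runtimes
    estimator.monitor(runtime)
    estimator.analyze_and_plan()

# Absolute error against the next (hypothetical) task runtime of 105 s.
offline_error = abs(offline_estimate - 105.0)
online_error = abs(estimator.estimate - 105.0)
```

With these made-up numbers the online estimate ends up closer to the next runtime than the fixed offline guess, which is the behaviour the abstract reports in general terms.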
Hanborq Optimizations on Hadoop MapReduce
Hanborq Inc.
A Hanborq-optimized Hadoop distribution with a focus on high MapReduce performance. It is the core part of HDH (Hanborq Distribution with Hadoop for Big Data Engineering).
Characterizing a High Throughput Computing Workload: The Compact Muon Solenoi...
Rafael Ferreira da Silva
Presentation held at ICCS 2015 Conference - Reykjavik, Iceland
High throughput computing (HTC) has aided the scientific community in the analysis of vast amounts of data and computational jobs in distributed environments. To manage these large workloads, several systems have been developed to efficiently allocate and provide access to distributed resources. Many of these systems rely on estimates of job characteristics (e.g., job runtime) to characterize the workload behavior, and such estimates are hard to obtain in practice. In this work, we perform an exploratory analysis of the CMS experiment workload using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict job characteristics based on the collected data. Experimental results show that our process estimates job runtime with 75% accuracy on average, and produces nearly optimal predictions for disk and memory consumption.
More information: www.rafaelsilva.com
An introduction to MR and Hadoop and a view on the opportunities to use MR with databases, i.e., SQL-MapReduce by Teradata and In-database MR by Oracle.
The presentation was used during a class of Datenbanken Implementierungstechniken in 2013.
Using Simple PID Controllers to Prevent and Mitigate Faults in Scientific Wor...
Rafael Ferreira da Silva
Presentation held at the 11th Workflows in Support of Large-Scale Science, October 14, 2016.
Abstract - Scientific workflows have become mainstream for conducting large-scale scientific research. As a result, many workflow applications and Workflow Management Systems (WMSs) have been developed as part of the cyberinfrastructure to allow scientists to execute their applications seamlessly on a range of distributed platforms. In spite of many success stories, a key challenge for running workflows in distributed systems is failure prediction, detection, and recovery. In this paper, we propose an approach that uses control theory developed as part of autonomic computing to predict failures before they happen, and mitigate them when possible. The proposed approach applies the proportional-integral-derivative controller (PID controller) control loop mechanism, which is widely used in industrial control systems, to mitigate faults by adjusting the inputs of the controller. The PID controller aims at detecting the possibility of a fault far enough in advance so that an action can be performed to prevent it from happening. To demonstrate the feasibility of the approach, we tackle two common execution faults of the Big Data era: data storage overload and memory overflow. We define, implement, and evaluate simple PID controllers to autonomously manage data and memory usage of a bioinformatics workflow that consumes/produces over 4.4TB of data, and requires over 24TB of memory to run all tasks concurrently. Experimental results indicate that workflow executions may significantly benefit from PID controllers, in particular under online and unknown conditions. Simulation results show that nearly-optimal executions (slowdown of 1.01) can be attained when using our proposed method, and faults are detected and mitigated far in advance of their occurrence.
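For readers unfamiliar with the control loop mentioned in the abstract, here is a minimal textbook PID controller. It is a generic sketch, not the controllers evaluated in the paper: the gains, the 0.8 disk-usage setpoint, and the interpretation of the output as a concurrency adjustment are illustrative assumptions.

```python
class PIDController:
    """Textbook discrete PID: output = Kp*e + Ki*sum(e*dt) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first sample

    def update(self, measurement, dt=1.0):
        error = self.setpoint - measurement
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Assumed scenario: keep disk usage near 80% of capacity.
controller = PIDController(kp=1.0, ki=0.1, kd=0.05, setpoint=0.8)
signal = controller.update(measurement=0.9)  # usage is above the setpoint

# A negative signal could be read as "start fewer data-producing tasks";
# how the signal maps to an action is a design choice outside this sketch.
```

The controller reacts to the current error (proportional term), its accumulated history (integral term), and its trend (derivative term), which is what lets it respond before a hard limit is actually reached.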
As MapReduce clusters have become popular, their scheduling is an important factor to consider. To achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission. Hence, different scheduling algorithms for MapReduce are necessary to provide good performance. This slide deck provides an overview of many different scheduling algorithms for MapReduce.
This slide deck is used as an introduction to the MapReduce programming model, trying hard to be Hadoop-agnostic, as part of the Distributed Systems and Cloud Computing course I hold at Eurecom.
Course website:
http://michiard.github.io/DISC-CLOUD-COURSE/
Sources available here:
https://github.com/michiard/DISC-CLOUD-COURSE
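As a framework-independent illustration of the MapReduce programming model, the sketch below runs the classic word count in a single process. It is not taken from the course material; the function names and input strings are made up, and the `shuffle` step stands in for what a real framework does across machines.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())

counts = word_count(["map reduce map", "reduce shuffle"])
```

Because each map call sees only one record and each reduce call sees only one key, both phases can be spread over many workers, which is the property the model is built around.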
(Slides) Task scheduling algorithm for multicore processor system for minimiz...
Naoki Shibata
Shohei Gotoda, Naoki Shibata and Minoru Ito : "Task scheduling algorithm for multicore processor system for minimizing recovery time in case of single node fault," Proceedings of IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2012), pp.260-267, DOI:10.1109/CCGrid.2012.23, May 15, 2012.
In this paper, we propose a task scheduling algorithm for a multicore processor system which reduces the recovery time in case of a single fail-stop failure of a multicore processor. Many of the recently developed processors have multiple cores on a single die, so that one failure of a computing node results in failure of many processors. In the case of a failure of a multicore processor, all tasks which have been executed on the failed multicore processor have to be recovered at once. The proposed algorithm is based on an existing checkpointing technique, and we assume that the state is saved when nodes send results to the next node. If a series of computations that depends on former results is executed on a single die, we need to execute all parts of the series of computations again in the case of failure of the processor. The proposed scheduling algorithm tries not to concentrate tasks on the processors of a single die. We designed our algorithm as a parallel algorithm that achieves O(n) speedup where n is the number of processors. We evaluated our method using simulations and experiments with four PCs. We compared our method with an existing scheduling method, and in the simulation, the execution time including recovery time in the case of a node failure is reduced by up to 50% while the overhead in the case of no failure was a few percent in typical scenarios.
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
MLconf
Fast, Cheap and Deep – Scaling Machine Learning: Distributed high throughput machine learning is both a challenge and a key enabling technology. Using a Parameter Server template we are able to distribute algorithms efficiently over multiple GPUs and in the cloud. This allows us to design very fast recommender systems, factorization machines, classifiers, and deep networks. This degree of scalability allows us to tackle computationally expensive problems efficiently, yielding excellent results e.g. in visual question answering.
Hadoop interview questions for freshers and experienced people. A resource for beginners and experts who are eager to learn Hadoop from scratch.
Read more here http://softwarequery.com/hadoop/
Beyond Map/Reduce: Getting Creative With Parallel Processing
Ed Kohlwey
While Map/Reduce is an excellent environment for some parallel computing tasks, there are many ways to use a cluster beyond Map/Reduce. Within the last year, YARN and NextGen Map/Reduce have been contributed into the Hadoop trunk, Mesos has been released as an open source project, and a variety of new parallel programming environments have emerged such as Spark, Giraph, Golden Orb, Accumulo, and others.
We will discuss the features of YARN and Mesos, and talk about obvious yet relatively unexplored uses of these cluster schedulers as simple work queues. Examples will be provided in the context of machine learning. Next, we will provide an overview of the Bulk-Synchronous-Parallel model of computation, and compare and contrast the implementations that have emerged over the last year. We will also discuss two other alternative environments: Spark, an in-memory version of Map/Reduce which features a Scala-based interpreter; and Accumulo, a BigTable-style database that implements a novel model for parallel computation and was recently released by the NSA.
Task Scheduling Algorithm for Multicore Processor Systems with Turbo Boost an...
Naoki Shibata
Yosuke Wakisaka, Naoki Shibata, Keiichi Yasumoto, Minoru Ito, and Junji Kitamichi : Task Scheduling Algorithm for Multicore Processor Systems with Turbo Boost and Hyper-Threading, In Proc. of The 2014 International Conference on Parallel and Distributed Processing Techniques and Applications(PDPTA'14), pp. 229-235
In this paper, we propose a task scheduling algorithm for multiprocessor systems with Turbo Boost and Hyper-Threading technologies. The proposed algorithm minimizes the total computation time taking account of dynamic changes of the processing speed by the two technologies, in addition to the network contention among the processors. We constructed a clock speed model with which the changes of processing speed with Turbo Boost and Hyper-threading can be estimated for various processor usage patterns. We then constructed a new scheduling algorithm that minimizes the total execution time of a task graph considering network contention and the two technologies. We evaluated the proposed algorithm by simulations and experiments with a multiprocessor system consisting of 4 PCs. In the experiment, the proposed algorithm produced a schedule that reduces the total execution time by 36% compared to conventional methods which are straightforward extensions of an existing method.
This presentation explains first the vision behind the release as Open Source of OpenSplice DDS. Then it highlights the new product structure and licensing model.
Search Party - Internet & Social Media Search Tricks that Will Improve the Wa...
Marian Madonia, CSP
Use these models to be a master at doing internet & social media searches. Whether using Google, Bing, Yahoo, LinkedIn, or another search engine, these tips will help you dig deep to find what you need. Created by Marian Madonia, CSP
This presentation covers the key facts you need to know about the current and upcoming PCI compliance requirements.
Key take-aways:
*What are the new PCI Compliance changes (current and planned)
*When the changes go into effect & how they impact your business
*How to automate the PCI Compliance processes
Scott Callaghan from the Southern California Earthquake Center presented this deck in a recent Blue Waters Webinar.
"I will present an overview of scientific workflows. I'll discuss what the community means by "workflows" and what elements make up a workflow. We'll talk about common problems that users might be facing, such as automation, job management, data staging, resource provisioning, and provenance tracking, and explain how workflow tools can help address these challenges. I'll present a brief example from my own work with a series of seismic codes showing how using workflow tools can improve scientific applications. I'll finish with an overview of high-level workflow concepts, with an aim to preparing users to get the most out of discussions of specific workflow tools and identify which tools would be best for them."
Watch the video: http://wp.me/p3RLHQ-gtH
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
How Texas Instruments Uses InfluxDB to Uphold Product Standards and to Improv...
DevOps.com
Discover how Texas Instruments uses a time series database to gain better insights into their industrial operations. By collecting high level metrics and events, they are continuously improving productivity and are becoming data-driven. Learn how InfluxDB can result in cost savings and have a direct impact on performance by analyzing seasonality, trends and behaviors.
In this webinar, Michael Hinkle will cover:
Texas Instruments’ ability to innovate while creating electronic systems for multiple industries
How an organization’s machines, tools and processes can impact proficiency
InfluxDB’s impact on their production and quality assurance
Common Design Elements for Data Movement Eli Dart
Ed Dodds
Eli Dart, Network Engineer ESnet Science Engagement Lawrence Berkeley National Laboratory Cosmology CrossConnects Workshop Berkeley, CA February 11, 2015
The Apache Hadoop project and the Hadoop ecosystem have been designed to be extremely flexible and extensible. HDFS, YARN, and MapReduce combined have more than 1000 configuration parameters that allow users to tune the performance of Hadoop applications and, more importantly, extend Hadoop with application-specific functionality without having to modify any of the core Hadoop code.
In this talk, I will start with simple extensions, such as writing a new InputFormat to efficiently process video files. I will present some extensions that boost application performance, such as optimized compression codecs and pluggable shuffle implementations. With the refactoring of the MapReduce framework and the emergence of YARN as a generic resource manager for Hadoop, one can extend Hadoop further by implementing new computation paradigms.
I will discuss one such computation framework, that allows Message Passing applications to run in the Hadoop cluster alongside MapReduce. I will conclude by outlining some of our ongoing work, that extends HDFS, by removing namespace limitations of the current Namenode implementation.
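Apache Tez : Accelerating Hadoop Query Processing
Teddy Choi
Jeff Markham, Technical Director for Asia at Hortonworks, introduces Tez. Tez is software that replaces MapReduce to accelerate Hadoop query processing. The talk explains why Tez was created, how it is structured, how optimization is carried out, and how much performance has improved.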
Similar to Scientific Applications of The Data Distribution Service (20)
This was the opening presentation of the Zenoh Summit in June 2022. The presentation goes through the motivations that led to the design of the zenoh protocol and provides an introduction to its core concepts. This is the place to start to understand why you should care about zenoh and the way in which it disrupts existing technologies.
The recording for this presentation is available at https://bit.ly/3QOuC6i
Zenoh is a rapidly growing Eclipse project that unifies data in motion, data at rest and computations. It elegantly blends traditional pub/sub with geo-distributed storage, queries and computations, while retaining a level of time and space efficiency that is well beyond any of the mainstream stacks. This presentation will provide an introduction to Eclipse Zenoh along with a crisp explanation of the challenges that motivated the creation of this project. We will go through a series of real-world use cases that demonstrate the advantages brought by Zenoh in enabling and optimising typical edge scenarios and in simplifying the development of distributed applications at any scale.
Data Decentralisation: Efficiency, Privacy and Fair MonetisationAngelo Corsaro
A presentation give at the European H-Cloud Conference to motivate decentralisation as a mean to improve energy efficiency, privacy, and opportunity for monetisation for your digital footprint.
zenoh: zero overhead pub/sub store/query computeAngelo Corsaro
Unifies data in motion, data in-use, data at rest and computations.
It carefully blends traditional pub/sub with distributed queries, while retaining a level of time and space efficiency that is well beyond any of the mainstream stacks.
It provides built-in support for geo-distributed storages and distributed computations
zenoh -- the ZEro Network OverHead protocolAngelo Corsaro
This presentation introduces the key ideas behind zenoh -- an Internet scale data-centric protocol that unifies data-sharing between any kind of device including those constrained with respect to the node resources, such as computational resources and power, as well as the network.
zenoh -- the ZEro Network OverHead protocolAngelo Corsaro
This presentation introduces the key ideas behind zenoh -- an Internet scale data-centric protocol that unifies data-sharing between any kind of device including those constrained with respect to the node resources, such as computational resources and power, as well as the network.
Fog computing aims at providing horizontal, system-level, abstractions to distribute computing, storage, control and networking functions closer to the user along a cloud-to-thing continuum. Whilst fog computing is increasingly recognised as the key paradigm at the foundation of Consumer and Industrial Internet of Things (IoT), most of the initiatives on fog computing focus on extending cloud infrastructure. As a consequence, these infrastructure fall short in addressing heterogeneity and resource constraints characteristics of fog computing environments.
fog⌀5 (read as fog O-five or fog OS) is an Eclipse IoT Project that is building a fog computing infrastructure from first principle. In other terms, fog⌀5 has been designed to address the challenges induced by fog computing in terms of heterogeneity, decentralisation, resource constraints, geographical scale and security.
This webcast will introduce fog⌀5, motivate its architecture and building blocks as well as provide a demonstration of fog⌀5 provisioning applications that span from the cloud to the things.
The video recording for this presentation is available at https://www.youtube.com/watch?v=Osl3O5DxHF8
Making the right data available at the right time, at the right place, securely, efficiently, whilst promoting interoperability, is a key need for virtually any IoT application. After all, IoT is about leveraging access data – that used to be unavailable – in order to improve the ability to react, manage, predict and preserve a cyber-physical system.
The Data Distribution Service (DDS) is a standard for interoperable, secure, and efficient data sharing, used at the foundation of some of the most challenging Consumer and Industrial IoT applications, such as Smart Cities, Autonomous Vehicles, Smart Grids, Smart Farming, Home Automation and Connected Medical Devices.
In this presentation we will (1) introduce the Eclipse Cyclone DDS project, (2) provide a quick intro that will get you started with Cyclone DDS, (3) present a few Cyclone DDS use cases, and (4) share the Cyclone DDS development road-map.
Fog Computing is a paradigm that complements and extends cloud computing by providing an end-to-end virtualisation of computing, storage and communication resources. As such, fog computing allow applications to be transparently provisioned and managed end-to-end. This presentation first motivates the need for fog computing, then introduced fog05 the first and only Open Source fog computing platform!
Data Sharing in Extremely Resource Constrained EnvionrmentsAngelo Corsaro
This presentation introduces XRCE a new protocol for very efficiently distributing data in resource constrained (power, network, computation, and storage) environments. XRCE greatly improves the wire efficiency of existing protocol and in many cases provides higher level abstractions.
RUSTing is not a tutorial on the Rust programming language.
I decided to create the RUSTing series as a way to document and share programming idioms and techniques.
From time to time I’ll draw parallels with Haskell and Scala, having some familiarity with one of them is useful but not indispensable.
Scientific Applications of The Data Distribution Service
1. Scientific Applications of
Data Distribution Service
Svetlana Shasharina#, Nanbor Wang,
Rooparani Pundaleeka, James Matykiewicz and
Steve Goldhaber http://www.txcorp.com
# sveta@txcorp.com
3. Tech-X Corporation
• Founded in 1994, located in Boulder CO
• 65 people (mostly computational physics, app math,
applied computer science)
• Have been merging CS (C++, CORBA, GRID, MPI, GPU,
complex vis and data management) and physics, and are now
looking at DDS
• Funded by DOE, NASA, DOD and sales
• Applications in
– Plasma modeling (accelerators, lasers, fusion devices,
semiconductors) and beam physics
– Nanotechnology
– Data analysis
4. Large Synoptic Survey
Telescope (LSST)
• Ground-based digital camera to be built in
Chile, to start in 2020 (?). Funded
by DOE, NASA, universities and the private
sector
• Up to 2000 images/day +
calibration data -> 30 TB/day
• Processed locally and reprocessed
and archived in Illinois (National
Center for Supercomputing
Applications)
• Uses OpenSplice for control
software
• Can we help with data
management: orchestration of
steps and monitoring of data
processing?
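To put the LSST figures in perspective, here is a back-of-the-envelope conversion. It assumes decimal terabytes and a data flow spread evenly over 24 hours; neither assumption comes from the slides.

```python
# Rough arithmetic for the LSST figures quoted above.
# Assumptions (not from the slides): 1 TB = 10**12 bytes, and the
# 30 TB/day is spread evenly over a full 24-hour day.
TB = 10**12
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 30 * TB
images_per_day = 2000

# Sustained average throughput in megabytes per second
avg_rate_mb_s = daily_bytes / SECONDS_PER_DAY / 10**6
# Average volume per image, calibration data included
avg_gb_per_image = daily_bytes / images_per_day / 10**9

print(f"average rate: {avg_rate_mb_s:.0f} MB/s")
print(f"average volume per image: {avg_gb_per_image:.0f} GB")
```

Under these assumptions any monitoring or orchestration layer has to keep up with a sustained stream of a few hundred MB/s, which is why bulk data is kept out of the control topics later in the talk.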
5. NoVA
• NoVA: NuMI Off-axis νe (electron
neutrino) Appearance experiment
• Will generate neutrino beams at
FNAL and send them to a detector in
Ash River, Minnesota (500 miles in < 3 ms)
• DOE funded (many labs and universities)
• RMS (Responsive Messaging System) is a DDS-based
system to pass control and status messages in the NoVA
data acquisition system (two types of topics, but many
actual topics to implement point-to-point communications)
• Will eventually need to go over WAN and provide 10 Hz
status transmissions between ~100 applications
• Simplifies OpenSplice using traits (like simd-cxx) to
minimize the number of data types and to map topics to
strings
6. SciDAC-II LQCD
• LQCD: Lattice Quantum Chromodynamics
(computational version of QCD: a theory of strong
interaction involving quarks and gluons making up
hadrons like protons and neutrons)
• DDS is used to monitor clusters doing
LQCD calculations (detect job failures, evaluate node
loads and performance, start/kill jobs, etc.)
• Topics for monitoring and control of jobs and
resources
• Uses OpenSplice
7. Common themes for scientific
apps and DDS
• RT issues are not well estimated
• Common usability needs
– Support for scientific data formats and data products (into and
out of topics): domain schemas and data transformation tools
– Control and monitoring topics (can we come up with a reusable
schema?)
– Simple APIs corresponding to expectations of scientists
– Ease of modification (evolving systems not just production
systems)
– QoS cookbook (how to get correct combinations)
– General education
• Is DDS good for point-to-point (Bill talks only to Pete)?
• How does one use DDS without killing the system (memory etc.)?
• Other requirements
– Site and community specific security
– WAN operation (Chicago and Berkeley, for example)
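As an illustration of what one QoS cookbook entry could capture: DDS matches a writer with a reader only when the writer's offered policy is at least as strong as the one the reader requests. The sketch below checks that rule for reliability and durability in plain Python. The enum and function names are illustrative only and are not part of any DDS or OpenSplice API.

```python
from enum import IntEnum

# Policy kinds ordered from weakest to strongest, as in the DDS specification.
class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1

class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1
    TRANSIENT = 2
    PERSISTENT = 3

def compatible(offered_rel, requested_rel, offered_dur, requested_dur):
    """Return True if the writer's offered QoS satisfies the reader's request."""
    return offered_rel >= requested_rel and offered_dur >= requested_dur

# A late-joining reader that wants previously published data from a
# reliable writer: matched only if the writer offers enough durability.
ok = compatible(Reliability.RELIABLE, Reliability.RELIABLE,
                Durability.TRANSIENT_LOCAL, Durability.TRANSIENT_LOCAL)
bad = compatible(Reliability.BEST_EFFORT, Reliability.RELIABLE,
                 Durability.VOLATILE, Durability.VOLATILE)
print(ok, bad)
```

A mismatch of this kind does not raise an error at write time; the reader simply receives nothing, which is one reason a cookbook of known-good combinations would help newcomers.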
8. Common extra expectations
• Automated test harness:
– How does one test for correct behavior and QoS?
– Avoid regression in a rapidly evolving system modified by a
team
• Interacting with databases (all data should be archived and
allow queries)
• Can we do everything using DDS to minimize external
dependencies?
– Efficient bulk data transfer (usually point-to-point and BIG
triangles :-)
– Workflow engine (workflow: loosely coupled applications,
often communicating through files, that can be distributed, while
a simulation is typically tightly coupled on an HPC resource)
• Interacting with Web interfaces and Web Services
9. QuIDS: to address some issues
• QuIDS: Quality Information Distribution System
• Helping the applications above through Phase II SBIR from
DOE (HEP office)
• Collaboration of Tech-X and Fermilab
• Goals (we will talk about the ones in red in the rest of this talk):
– Implement a DDS-based system to monitor distributed
processing of astronomical data
• Simplifying C++ (done with simd-cxx?) and Python APIs
• Support for FITS and monitoring and control data, images,
histograms, spectra
• Security
• WAN
• Testing harness
– Investigate use of DDS for workflow management
11. QuIDS at FNAL computational
domain
[Figure: Monitor, CampaignManager and a workflow of application(s) connected through readers (R) and writers (W) on MCTopic and SciTopic, all inside the computational domain]
12. Generic workflows: do we need all?
• Workflow is something outside of HPC (loosely coupled and
can tolerate even WAN, while simulation is something that
goes to qsub…)
• Kepler (de-facto workflow engine expected for DOE
applications):
– Support for complex workflows
– Java based
– Heavy and hard to learn
– Not portable to future platforms (DOE supercomputers might not have
Java at all)
• Real workflows in astronomy are simple (they do not need the
expressiveness of a full programming language or pi-algebra)
– Pipelines
– DAGs
• How does one implement such workflows using DDS?
13. Parallel pipeline: most of
astronomy workflows
[Figure: Initialize, followed by parallel worker rows (Worker(0), ..., Worker(2)), each running Task(0), Task(1), ..., Task(N-1), followed by Finalize]
Tasks can be continued by different
worker processes: data can be passed
between them (e.g., Worker(1)
performs Task(1) using data from
Worker(0))
14. ddspipe: current implementation of
workflow engine
• A parallel pipeline job consists of
– Initialization phase (splitting data into manageable pieces) running on
one node
– Parallel worker processes doing data processing tasks (possibly not
sharing the same address space)
– Finalization step (merging data into a final image or movie)
• There is an expected order in tasks, so that tasks can be numbered and
the output data of a previous step serves as input to the next
• Design decisions for now:
– Workers do not communicate with each other
– Workers are given work by a mediating entity: tuple space manager
(no self-organization)
– No scheduling for now (round-robin: tasks and workers are queued in
the server)
– Workers can get data coming from a task completed by a different
worker (do not address the problem of data transfer now)
• All communication is via DDS topics while data to work on is available
through local files to all workers
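The phases and design decisions above can be illustrated with a small in-process simulation. This is only a sketch: it runs sequentially in one Python process, uses no DDS topics or files, and all function names are invented for the example.

```python
from collections import deque

def initialize(data, n_pieces):
    """Initialization phase: split the input into n_pieces chunks."""
    return [data[i::n_pieces] for i in range(n_pieces)]

def finalize(pieces):
    """Finalization phase: merge the processed chunks into one result."""
    return sorted(x for piece in pieces for x in piece)

def run_pipeline(data, tasks, n_pieces, worker_ids):
    pieces = initialize(data, n_pieces)
    idle = deque(worker_ids)   # round-robin queue of workers
    log = []                   # (sequence, task index, worker) records
    for step, task in enumerate(tasks):      # tasks run in a fixed order
        for seq in range(n_pieces):
            worker = idle.popleft()          # next worker in the queue
            pieces[seq] = [task(x) for x in pieces[seq]]
            log.append((seq, step, worker))
            idle.append(worker)              # worker rejoins the queue
    return finalize(pieces), log

tasks = [lambda x: x + 1, lambda x: x * 2]
result, log = run_pipeline(list(range(6)), tasks, n_pieces=2,
                           worker_ids=["w0", "w1", "w2"])
print(result)
print(log)
```

With three workers and two data pieces, the log shows the second task of piece 0 being handled by a different worker than the first, which is the situation the last design decision refers to.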
15. GDS = Tuple Space but we
want more (?)
• Tickets:
– Task ticket (id, in data, out data, status)
– Task status: standby, ready, running, finished
– Worker ticket (id, task ticket, status)
– Worker status: ready, busy
• Classic tuple space = set of task tickets, and we could use only them,
but… instead of dealing with a self-organized (wild) system, we
would like to implement
– Workflow: M sequences of tasks with matching in and out data
– Scheduling (based on policies, resources, location of workers)
– Fault-tolerance (detecting and rescheduling of incomplete tasks)
• Hence: we decided to have a class TupleSpaceServer to address
these (currently just pipeline and queues and no FT)
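The two ticket types listed above could be modeled roughly as follows. This is a sketch in plain Python rather than the IDL topic definitions actually used; the class names, field names and the file-naming scheme are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class TaskStatus(Enum):
    STANDBY = "standby"
    READY = "ready"
    RUNNING = "running"
    FINISHED = "finished"

class WorkerStatus(Enum):
    READY = "ready"
    BUSY = "busy"

@dataclass
class TaskTicket:
    id: int
    in_data: str       # name of the input data (hypothetical file name)
    out_data: str      # name of the output data (hypothetical file name)
    status: TaskStatus = TaskStatus.STANDBY

@dataclass
class WorkerTicket:
    id: str
    task: Optional[TaskTicket] = None
    status: WorkerStatus = WorkerStatus.READY

def make_sequence(first_input, n_tasks, prefix="seq0"):
    """Build one sequence of tickets whose in and out data match up."""
    tickets, in_data = [], first_input
    for n in range(n_tasks):
        out_data = f"{prefix}_task{n}.out"
        tickets.append(TaskTicket(n, in_data, out_data))
        in_data = out_data   # next task consumes what this one produces
    return tickets

tickets = make_sequence("raw_0.fits", 3)
for t in tickets:
    print(t)
```

A workflow in the sense used above would then be M such sequences, one per data piece.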
16. ddspipe classes:
• Initializer
– Splits data, possibly creates workers and workspaces, publishes
all initial work tickets with correct specification of the workflow
• TupleSpaceServer
– Changes status in task tickets in accordance with the workflow order:
once a worker reports that task n is done, the ticket for task n+1 with <n+1
in-data> = <n out-data> is changed to ready and a worker topic is
published (with the worker id next in the queue). Once a worker
reports that it is doing this work, the status is changed to running, etc.
• Workers
– Publish their status
– Listen to task assignment (matching its id to the one in the worker
ticket)
• Finalizer
– Whatever to finish up (merge data and clean)
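A toy version of the server logic may make the hand-off clearer. It assumes tasks are numbered in execution order, so that completing one task releases the next one in the sequence, and it replaces all DDS communication with direct method calls. The method names are invented for this sketch and are not the ddspipe API.

```python
from collections import deque

class TupleSpaceServer:
    """Toy, in-process stand-in for the server described above (no DDS)."""

    def __init__(self, n_tasks, worker_ids):
        self.status = ["standby"] * n_tasks   # one ticket status per task
        self.assigned = [None] * n_tasks      # worker id per task
        self.idle = deque(worker_ids)         # queue of ready workers

    def _release(self, n):
        """Mark task n ready and assign it to the next worker in the queue."""
        self.status[n] = "ready"
        self.assigned[n] = self.idle.popleft()
        return self.assigned[n]

    def start_job(self):
        return self._release(0)

    def report_running(self, n):
        self.status[n] = "running"

    def report_done(self, n):
        """Worker finished task n: free it and release the next task, if any."""
        self.status[n] = "finished"
        self.idle.append(self.assigned[n])
        if n + 1 < len(self.status):
            return self._release(n + 1)
        return None

server = TupleSpaceServer(3, ["w0", "w1"])
order = [server.start_job()]
for n in range(3):
    server.report_running(n)
    nxt = server.report_done(n)
    if nxt is not None:
        order.append(nxt)
print(order, server.status)
```

Because the worker that just finished rejoins the back of the queue before the next ticket is released, consecutive tasks of a sequence alternate between workers whenever more than one worker is available.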
17. States of Tasks in Tuple Space
[Figure: each task ticket moves from its initial state to its eventual state as Standby -> Ready -> Running -> Completed through Tuplespace internal scheduling. Tasks are executed sequentially: jobStatus,Run triggers Task0; taskStatus,Task0,Completed triggers Task1; taskStatus,Taskn-1,Completed triggers Taskn.]
Sequences proceed independently in parallel
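The ticket life cycle shown on this slide amounts to a small state machine. A minimal sketch of a transition check, using the state names from the figure (the table and function are illustrative, not taken from the ddspipe code):

```python
# Legal ticket transitions, following the order shown in the figure.
ALLOWED = {
    "Standby": {"Ready"},
    "Ready": {"Running"},
    "Running": {"Completed"},
    "Completed": set(),
}

def advance(current, new):
    """Return the new state, or raise if the transition is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current} -> {new}")
    return new

state = "Standby"
for target in ("Ready", "Running", "Completed"):
    state = advance(state, target)
print(state)
```

Rejecting out-of-order updates in one place is a cheap way to catch duplicated or reordered status messages before they corrupt the ticket table.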
18. Tuple Space Manager Maintains
Task Tickets and Schedules Tasks
[Figure: Initializer, Tuple Space Manager, Finalizer and Workers connected by ticket, jobStatus, workerStatus and workTicket messages. The manager keeps records labeled Job ID, task seq ID and status ID, plus a queue of idle workers; states shown: Run, Completed, Disposable.]
19. Status and next steps of
ddspipe (beyond what bash can
do :-)
• Prototype working
– Although we do have some issues with memory and bugs
– I would like to experiment with no queuing: the next task is up for
grabs as soon as one of the tasks of the previous stage is finished
• Next steps
– User definition of workflow
– Multiple jobs
– Separation of worker manager from task manager?
– Implementing workers doing slit-spectroscopy based on
running R
– DAG support
– Some scheduling (balance between data transfer and mixing
data between slow and fast workers?)
– Data transfer implementation and in-memory data exchange
20. Security for scientific projects:
from nothing to everything
• OpenSplice enterprise edition provides Secure Networking
Service:
– Security parameters (i.e., data encryption and authentication) are
defined in Security Profiles in the OpenSplice configuration file.
– Node authentication and access control are also specified in the
configuration profile.
– Security Profiles are attached to partitions which set the security
parameters in use for that partition.
– Secure networking is activated as a domain Service
• Scientists are not used to paying
• Used to:
– Authenticate and authorize on connection
– Rules are kept in the admin area of a virtual organization (DDS
should consult them)
21. Providing security in community
edition of OpenSplice
• Tried to replace the lower networking layer of
OpenSplice with OpenSSL to see how one can provide
authentication, authorization and encryption
• OpenSSL is an open source toolkit:
– Secure Sockets Layer and Transport Layer Security (new
standard, replacing SSL)
– General purpose cryptography library
– Public Key Infrastructure (PKI): e.g., certificate creation, signing
and checking, CA management
22. Switching to OpenSSL++ in the
networking layer of community
OpenSplice
Original stack                 | Modified stack
Applications                   | Applications
Data Centric Publish/Subscribe | Data Centric Publish/Subscribe
Real Time Publish/Subscribe    | Real Time Publish/Subscribe
RT Net DDSi                    | RT Net DDSi
UDP / IP                       | OpenSSL
• The UDP/IP layer handles the interface to the operating-
system’s socket interface
• Switching to OpenSSL allowed us to establish secure
tunnel between two sites
• But the configuration has to be done separately for each pair of nodes!
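To see why per-pair configuration becomes a problem, count the pairs. The sketch below assumes one tunnel configuration per unordered pair of nodes; the node counts are arbitrary examples, with 100 borrowed from the approximate number of NoVA applications mentioned earlier.

```python
from math import comb

def tunnel_count(n_nodes):
    """Number of unordered node pairs, i.e. tunnels to configure."""
    return comb(n_nodes, 2)

for n in (2, 10, 100):
    print(f"{n} nodes -> {tunnel_count(n)} tunnel configurations")
```

The count grows quadratically, so an approach that is workable between two sites quickly becomes unmanageable for a full deployment.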
23. Future Directions in Security
Work
• Waiting for new developments; will be happy to use them to
implement the security expected by DOE labs (our
collaborators)
• Explore the user data etc. fields to address application
specifics if these are not addressed by the security profile
24. PyDDS: Python bindings for
DDS communications
• Started with SWIG for wrapping generated bindings:
works fine but needed manual wrapping of multiple
generated classes
• Next worked with Boost.Python for wrapping of
communication layer (set of classes that are used in
the IDLPP generated code to call into OpenSplice for
communication) so that there will be no need to wrap
generated bindings
• Problem: need to take care of inheritance manually
and deal with several handlers that are unexposed C
structs used in forward declarations
25. Status and next steps for
PyDDS
• Hand-wrapping of C++ bindings using SWIG works:
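#!/usr/bin/python
import time
import TxddsPy
# Request reliable delivery for both the topic and the writer
qos = TxddsPy.Qos()
qos.setTopicReliable()
qos.setWriterReliable()
# Create a writer for the "rawImage" topic and apply the QoS
writer = TxddsPy.RawImageWriter("rawImage")
writer.setTopicQos(qos)
writer.setWriterQos(qos)
# Publish one rawImage sample
data = TxddsPy.rawImage()
writer.writeData(data)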
• Next:
– Investigate wrapping communication classes so that we
expose the minimum to Boost.Python
– Or develop tools for generating the glue code needed by
Boost.Python from a string list of topics
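The second option, generating glue code from a list of topic names, could look roughly like the sketch below. It only emits text. The naming rule (topic rawImage becomes class RawImageWriter), the constructor signature and the set of wrapped methods are assumptions modeled on the Python example above, and the generated fragment would still need the headers that declare the C++ writer classes.

```python
# Template for one Boost.Python class_ declaration per writer class.
TEMPLATE = '''\
    class_<{cls}>("{cls}", init<std::string>())
        .def("setTopicQos", &{cls}::setTopicQos)
        .def("setWriterQos", &{cls}::setWriterQos)
        .def("writeData", &{cls}::writeData);
'''

def writer_class(topic):
    """Hypothetical naming rule: topic 'rawImage' -> class 'RawImageWriter'."""
    return topic[0].upper() + topic[1:] + "Writer"

def generate_glue(topics, module="TxddsPy"):
    """Return the text of a Boost.Python module wrapping one writer per topic."""
    body = "".join(TEMPLATE.format(cls=writer_class(t)) for t in topics)
    return (
        "#include <boost/python.hpp>\n"
        "#include <string>\n"
        "using namespace boost::python;\n\n"
        f"BOOST_PYTHON_MODULE({module})\n{{\n{body}}}\n"
    )

glue = generate_glue(["rawImage", "histogram"])
print(glue)
```

Driving the generator from a plain list keeps the per-topic effort to one line, which is the point of the proposal; reader classes and data types would need analogous templates.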
26. QuIDS summary and future
directions
• We have prototyped DDS-based solutions for astronomy data
processing applications
– Tools for bringing FITS data into DDS
– Simple QoS scenarios (getting prepublished data)
– Parallel pipeline workflows
– Security studies
– Python APIs
• Possible next steps
– Language to describe workflows for user input into the system
– More complex workflows and concrete implementations
– Implementing FNAL security requirements
– WAN communication between FNAL and LBNL going through firewall
– Streamlining glue code generation for Python
– Testing harness
– Bulk data transfers
– Archiving data into databases
– Web interfaces
27. Acknowledgements
• Ground Data System (GDS) team from Fermi National
Accelerator Laboratory
• PrismTech
• Nikolay Malitsky (BNL)
• OpenSplice mailing list and its contributors
• US Department of Energy, Office of High Energy
Physics