3. Motivation
• Distributed memory clusters are becoming pervasive in industry and academia
• Shells are the default login environment on these systems
• Shell pipes are commonly used for composing extensible Unix commands.
• There has been no change to the syntax/semantics of shell pipes since their invention over 30 years ago.
• Growing need to compose massively parallel jobs quickly, using existing software
4. Extending Shells for Parallel Computing
• Build a simple, powerful coordination layer at the Shell
• The coordination layer transparently manages the parallelism in the workflow
• User specifies parallel computation as a dataflow graph using extensions to the Shell
• Provides the ability to combine different tools and build interesting parallel programs quickly.
5. Shell pipe extensions
• Pipeline fork
A | B on n procs
• Pipeline join
A on n procs | B
• Pipeline cycles
(++ n A)
• Pipeline key-value aggregation
A | B on keys
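The fork and join forms can be approximated in plain bash to show the data flow. The sketch below is only a local stand-in, not the tool itself: `fork_pipe` and `tag` are hypothetical names, and it assumes every task receives a full copy of the upstream output, which is how the later `cat stdin_file | B` command files read. `_ASPECT_TASKID` is the task-index variable used in the slides.

```bash
#!/usr/bin/env bash
# fork_pipe N CMD [ARGS...]: hypothetical local stand-in for "A | CMD on N procs".
# Assumption: each task gets a full copy of stdin and sees its index in
# _ASPECT_TASKID.
fork_pipe() {
    local n=$1; shift
    local tmp i
    tmp=$(mktemp -d) || return 1
    cat > "$tmp/stdin_file"                  # buffer the upstream output once
    for ((i = 0; i < n; i++)); do            # fork: start N tasks in the background
        _ASPECT_TASKID=$i "$@" < "$tmp/stdin_file" > "$tmp/out.$i" &
    done
    wait                                     # join: wait for every task
    for ((i = 0; i < n; i++)); do
        cat "$tmp/out.$i"                    # emit results in task order
    done
    rm -rf "$tmp"
}

# Demo task: prefix every input line with the task index.
tag() { sed "s/^/task $_ASPECT_TASKID: /"; }

printf 'a\nb\n' | fork_pipe 2 tag
```

The output of `fork_pipe` can be piped into a further command, which plays the role of the join stage.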
6. Parallel shell tasks extensions
> function foo()
{
    echo "hello world"
}
> foo on all procs # foo() on all CPUs
> foo on all nodes # foo() on all nodes
> foo on 10:2 procs # 10 tasks, stride 2: 2 tasks on each node
> foo on 10:2:2 procs # 10 tasks, stride 2, span 2: 2 tasks on alternate nodes
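One way to read the `tasks:stride:span` notation is as a task-to-node mapping. The sketch below is an assumed interpretation, not the tool's documented behavior: `place_tasks` is a hypothetical helper, `stride` consecutive tasks share a node, and `span` is the step between the nodes that are used.

```bash
#!/usr/bin/env bash
# place_tasks TASKS STRIDE [SPAN]: print one "task node" pair per line.
# Assumed reading of TASKS:STRIDE:SPAN (not confirmed by the slides):
#   STRIDE consecutive tasks share a node; SPAN is the step between used nodes.
place_tasks() {
    local tasks=$1 stride=$2 span=${3:-1} t
    for ((t = 0; t < tasks; t++)); do
        echo "$t $(( (t / stride) * span ))"
    done
}

place_tasks 10 2      # like "on 10:2 procs":   nodes 0 0 1 1 2 2 3 3 4 4
place_tasks 10 2 2    # like "on 10:2:2 procs": nodes 0 0 2 2 4 4 6 6 8 8
```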
7. Composing data-flow graphs
• Example 1:
function B1() { :; }   # placeholder body
function B2() { :; }   # placeholder body
function B()
{
    if (($_ASPECT_TASKID == 0 )) ; then
        B1
    else
        B2
    fi
}
A | B on 2 procs | C
(Diagram: A feeds B1 and B2 in parallel, and both feed C.)
8. Composing data-flow graphs
• Example 2:
function map()
{
    emit_tuple -k key -v value
}
function reduce()
{
    consume_tuple -k key -v value
    num=${#value[@]}
    for ((i=0; i < $num; i++)) ; do
        # process key=$key, value=${value[$i]}
        :
    done
}
map on all procs | reduce on keys
(Diagram: map feeds a key-value DHT, which feeds reduce.)
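The same map and reduce shape can be tried out on one machine in plain bash. This is a sketch under stated assumptions: `emit_tuple` here is a local helper that prints tab-separated tuples rather than the tool's built-in, `group_by_key` is a hypothetical stand-in for the key-value DHT, and word counting is just an example workload.

```bash
#!/usr/bin/env bash
# Local stand-in for "map on all procs | reduce on keys" (word count example).
# emit_tuple is a local helper here, not the tool's built-in command.
emit_tuple() { printf '%s\t%s\n' "$1" "$2"; }

# map: emit (word, 1) for every word read from stdin
map() {
    local word words
    while read -r -a words; do
        for word in "${words[@]}"; do emit_tuple "$word" 1; done
    done
}

# reduce: called once per key, with all values for that key as arguments
reduce() {
    local key=$1; shift
    local value=("$@")
    local num=${#value[@]} i sum=0
    for ((i = 0; i < num; i++)); do
        sum=$(( sum + ${value[i]} ))
    done
    printf '%s\t%s\n' "$key" "$sum"
}

# group_by_key: sort tuples so equal keys are adjacent, then reduce per key
group_by_key() {
    LC_ALL=C sort -t $'\t' -k1,1 -s | {
        local key val cur="" vals=()
        while IFS=$'\t' read -r key val; do
            if [[ -n $cur && $key != "$cur" ]]; then
                reduce "$cur" "${vals[@]}"
                vals=()
            fi
            cur=$key
            vals+=("$val")
        done
        if [[ -n $cur ]]; then reduce "$cur" "${vals[@]}"; fi
    }
}

printf 'a b a\nb a\n' | map | group_by_key
```

Sorting replaces the distributed hash table here only because it is the simplest way to bring equal keys together in a single process.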
10. Startup Overlay
• Script may have many instances requiring startup of parallel tasks
• Motivation for overlay:
– Fast startup of parallel shell workers
– Handles node failures gracefully
• Two level hierarchy: sectors and proxies
• Overlay node addressing: the compute node ID (bits 7 to 0) is divided into a sector id and a proxy id
11. Fault-Tolerance
• Proxy nodes monitor peers within sector, and sector heads monitor peer sectors
• Node 0 maintains a list of available nodes in the overlay in a master_node file
(Diagram: overlay sector 0 holds nodes 0 to 3 and overlay sector 1 holds nodes 4 to 7; each node is labeled with a proxy and an exec component; master_node file.)
18. 1. Process B pipes stdin into stdin_file
A | B on N procs
(Diagram components: BASH; stdout of A piped (1) to the stdin of the aspect-agent for B; the agent's reader; stdin_file.)
19. 2. Constructs command files for each task
A | B on N procs
(Diagram components: BASH; pipe (1); aspect-agent for B with reader and command dispatcher (2); stdin_file; command files, each containing "cat stdin_file | B".)
20. 3., 4. and 5. Execute command files in shell workers and marshal results back to the shell
A | B on N procs
(Diagram components: BASH; pipe (1); aspect-agent for B with reader, command dispatcher (2), flushers and I/O MUX; queue (3); node MUXes; compute nodes whose shell workers run B (4); results path (5); control and stdout channels; command files containing "cat stdin_file | B".)
21. 6. Replay command files on failure
A | B on N procs
(Diagram components: the pipeline of steps 1 to 5, plus a replayer (6) in the aspect-agent and shell workers running B on the local compute node.)
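Steps 1 to 6 can be mirrored on a single machine to make the flow concrete. The sketch below is an illustration only: `run_with_replay` is a hypothetical helper, tasks run one after another instead of on remote shell workers, and the limit of three attempts is an assumption rather than something stated in the slides.

```bash
#!/usr/bin/env bash
# run_with_replay N CMD: hypothetical local walk-through of steps 1 to 6.
# Assumptions: tasks run sequentially on this machine; at most 3 attempts each.
run_with_replay() {
    local n=$1 cmd=$2 dir i attempt
    dir=$(mktemp -d) || return 1
    cat > "$dir/stdin_file"                        # step 1: capture stdin
    for ((i = 0; i < n; i++)); do                  # step 2: one command file per task
        printf 'export _ASPECT_TASKID=%d\ncat %q | %s\n' \
            "$i" "$dir/stdin_file" "$cmd" > "$dir/cmd.$i"
    done
    for ((i = 0; i < n; i++)); do
        for attempt in 1 2 3; do
            # steps 3-5: execute the command file and collect its output
            if bash "$dir/cmd.$i" > "$dir/out.$i"; then
                cat "$dir/out.$i"
                break
            fi
            # step 6: the command file is still on disk, so it can be replayed
            echo "task $i failed (attempt $attempt), replaying" >&2
        done
    done
    rm -rf "$dir"
}

printf 'hello\n' | run_with_replay 2 'tr a-z A-Z'
```

Keeping the input in a file and the command in a script is what makes replay cheap: a failed task can be rerun without asking the upstream process A to produce its output again.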
23. 1. Agent inspects and hashes key
A | B on keys
(Diagram components: BASH; pipe from A; control channels; aspect-agent for B with key dispatcher (1).)
24. 2. Routes each key-value pair to a compute node based on the key hash and stores it in a hash table
A | B on keys
(Diagram components: BASH; pipe from A; control channels; aspect-agent for B with key dispatcher (1); routing (2) through the node MUX; compute nodes forming a distributed hash table, each with a gdbm hash table.)
25. 3. Each node constructs command files to pipe the key-value entry from its hash table into process B
A | B on keys
(Diagram components: as in step 2, plus emit_tuple on each compute node feeding B (3) from the local gdbm hash table.)
26. 4. Results from command file execution are marshaled back to the shell
A | B on keys
(Diagram components: as in step 3, plus the results path (4) from the node MUX to the agent's I/O MUX and on to stdout, with a control channel alongside.)
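The routing in steps 1 and 2 comes down to hashing the key and taking the result modulo the number of nodes. The following is a minimal sketch, assuming `cksum` as the hash (the slides do not say which hash the agent uses) and a flat file per node in place of the gdbm table; `node_for_key` and `route` are hypothetical names.

```bash
#!/usr/bin/env bash
# node_for_key KEY NODES: map a key to a node index in [0, NODES).
# Assumption: cksum (the POSIX CRC utility) stands in for the agent's real hash.
node_for_key() {
    local key=$1 nodes=$2 crc
    crc=$(printf '%s' "$key" | cksum | cut -d' ' -f1)
    echo $(( crc % nodes ))
}

# route NODES DIR: read "key<TAB>value" tuples from stdin and append each one
# to DIR/node.<index>, a flat-file stand-in for the per-node gdbm hash table.
route() {
    local nodes=$1 dir=$2 key value
    while IFS=$'\t' read -r key value; do
        printf '%s\t%s\n' "$key" "$value" \
            >> "$dir/node.$(node_for_key "$key" "$nodes")"
    done
}

demo_dir=$(mktemp -d)
printf 'apple\t1\npear\t1\napple\t1\n' | route 4 "$demo_dir"
echo "apple is stored on node $(node_for_key apple 4)"
rm -rf "$demo_dir"
```

Because the node index depends only on the key, every tuple with the same key lands on the same node, which is what lets each node later run B over complete per-key groups.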
31. TeraSort benchmark: Parallel bucket sort
• Step 1: spawn the data generator in parallel on each compute node, partitioning data across N nodes for task T if the first 2 bytes fall in the range from $2^{16} \cdot T / N$ to $2^{16} \cdot (T + 1) / N$
• Step 2: perform sort on local data on each node
• Step 3: merge results onto global file system
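The partition bounds in Step 1 are simple integer arithmetic on the 16-bit key prefix. The sketch below assumes integer division and treats the upper bound as exclusive, so that adjacent buckets do not overlap; `bucket_range` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# bucket_range T N: print the lower and upper bound of the 2-byte key-prefix
# range owned by task T out of N, i.e. 2^16*T/N and 2^16*(T+1)/N.
# Assumptions: integer division; the upper bound is treated as exclusive.
bucket_range() {
    local t=$1 n=$2
    echo "$(( (t << 16) / n )) $(( ((t + 1) << 16) / n ))"
}

# Example: bounds for 4 tasks covering the full prefix space 0..65535
for t in 0 1 2 3; do
    echo "task $t: $(bucket_range "$t" 4)"
done
```

With N = 4 the buckets are 0 to 16384, 16384 to 32768, 32768 to 49152, and 49152 to 65536, so each task receives a quarter of the prefix space.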
33. Related Work
• Ptolemy – embedded system design
• Yahoo Pipes – web content filtering
• Hadoop – Java implementation of MapReduce
• Dryad – distributed DAG data flow computation
34. Conclusion
• A debugger would be extremely helpful. Working on a bashdb implementation.
• A run-time simulator would be helpful to predict performance based on the characteristics of the cluster.
• Still thinking about how to incorporate our extensions for named pipes (i.e. mkfifo).