This is the keynote talk i delivered at GeekCamp.SG 2014
The main purpose of the talk is to create an awareness, if not existent, in the community when it comes to choosing and wanting to building a distributed system.
This presentation is not meant to be a survey of distributed computing through the ages but hopefully it serves as a good starting point in which the journeyman can start from.
I want to thank Jonas, CTO of Typesafe, as his work in Akka strongly influenced my own and i hope it would help you in the way his work helped me.
Distributed computing deals with hardware and software systems containing more than one processing element or storage element, concurrent processes, or multiple programs, running under a loosely or tightly controlled regime. In distributed computing a program is split up into parts that run simultaneously on multiple computers communicating over a network. Distributed computing is a form of parallel computing, but parallel computing is most commonly used to describe program parts running simultaneously on multiple processors in the same computer. Both types of processing require dividing a program into parts that can run simultaneously, but distributed programs often must deal with heterogeneous environments, network links of varying latencies, and unpredictable failures in the network or the computers.
Distributed computing deals with hardware and software systems containing more than one processing element or storage element, concurrent processes, or multiple programs, running under a loosely or tightly controlled regime. In distributed computing a program is split up into parts that run simultaneously on multiple computers communicating over a network. Distributed computing is a form of parallel computing, but parallel computing is most commonly used to describe program parts running simultaneously on multiple processors in the same computer. Both types of processing require dividing a program into parts that can run simultaneously, but distributed programs often must deal with heterogeneous environments, network links of varying latencies, and unpredictable failures in the network or the computers.
Introduction to distributed systems
Architecture for Distributed System, Goals of Distributed system, Hardware and Software
concepts, Distributed Computing Model, Advantages & Disadvantage distributed system, Issues
in designing Distributed System,
Cluster computing is a type of computing where a group of several computers are linked together, allowing the entire group of computers to behave as if it were a single entity. There are a wide variety of different reasons why people might use cluster computing for various computer tasks. It s also used to make sure that a computing system will always be available. It is unknown when this cluster computing concept was first developed, and several different organizations have claimed to have invented it.
Please contact me to download this pres.A comprehensive presentation on the field of Parallel Computing.It's applications are only growing exponentially day by days.A useful seminar covering basics,its classification and implementation thoroughly.
Visit www.ameyawaghmare.wordpress.com for more info
Introduction to distributed systems
Architecture for Distributed System, Goals of Distributed system, Hardware and Software
concepts, Distributed Computing Model, Advantages & Disadvantage distributed system, Issues
in designing Distributed System,
Cluster computing is a type of computing where a group of several computers are linked together, allowing the entire group of computers to behave as if it were a single entity. There are a wide variety of different reasons why people might use cluster computing for various computer tasks. It s also used to make sure that a computing system will always be available. It is unknown when this cluster computing concept was first developed, and several different organizations have claimed to have invented it.
Please contact me to download this pres.A comprehensive presentation on the field of Parallel Computing.It's applications are only growing exponentially day by days.A useful seminar covering basics,its classification and implementation thoroughly.
Visit www.ameyawaghmare.wordpress.com for more info
From Mainframe to Microservice: An Introduction to Distributed SystemsTyler Treat
An introductory overview of distributed systems—what they are and why they're difficult to build. We explore fundamental ideas and practical concepts in distributed programming. What is the CAP theorem? What is distributed consensus? What are CRDTs? We also look at options for solving the split-brain problem while considering the trade-off of high availability as well as options for scaling shared data.
This is the Expert Q&A from 2600hz and Cloudant on Database in Telecom. If you are a service provider, MSP or anyone running a VoIP switch, you should definitely check this out.
The Reactive Principles: Eight Tenets For Building Cloud Native ApplicationsLightbend
In this presentation by Jonas Bonér, creator of Akka and founder/CTO of Lightbend, we review a set of eight Reactive Principles that enable the design and implementation of Cloud Native applications–applications that are highly concurrent, distributed, performant, scalable, and resilient, while at the same time conserving resources when deploying, operating, and maintaining them.
The Reactive Principles: Design Principles For Cloud Native ApplicationsJonas Bonér
Reactive Summit Keynote 2020: https://www.youtube.com/watch?v=e5kek8vx2ws
Abstract: Building applications for the cloud means embracing a radically different architecture than that of a traditional single-machine monolith, requiring new tools, practices, and design patterns. The cloud’s distributed nature brings its own set of concerns–building a Cloud Native, Edge Native, or Internet of Things (IoT) application means building and running a distributed system on unreliable hardware and across unreliable networks. In this keynote session by Jonas Bonér, creator of Akka, founder/CTO of Lightbend, and Chair of the Reactive Foundation, we’ll review a set of Reactive Principles that enable the design and implementation of Cloud Native applications–applications that are highly concurrent, distributed, performant, scalable, and resilient, while at the same time conserving resources when deploying, operating, and maintaining them.
Lightning talk: highly scalable databases and the PACELC theoremVishal Bardoloi
Lightning talk I gave at Headspring's Friday brown bag event on 3/17/17. Basically a summary of what I've been blogging about at bardoloi.com
The content of these slides are from much greater sources than mine; to go to the original sources, you can start by looking at the References section at the end.
Hard Truths About Streaming and Eventing (Dan Rosanova, Microsoft) Kafka Summ...confluent
Eventing and streaming open a world of compelling new possibilities to our software and platform designs. They can reduce time to decision and action while lowering total platform cost. But they are not a panacea. Understanding the edges and limits of these architectures can help you avoid painful missteps. This talk will focus on event driven and streaming architectures and how Apache Kafka can help you implement these. It will also discuss key tradeoffs you will face along the way from partitioning schemes to the impact of availability vs. consistency (CAP Theorem). Finally we’ll discuss some challenges of scale for patterns like Event Sourcing and how you can use other tools and even features of Kafka to work around them. This talk assumes a basic understanding of Kafka and distributed computing, but will include brief refresher sections.
Go Reactive: Building Responsive, Resilient, Elastic & Message-Driven SystemsJonas Bonér
Abstract:
The demands and expectations for applications have changed dramatically in recent years. Applications today are deployed on a wide range of infrastructure; from mobile devices up to thousands of nodes running in the cloud—all powered by multi-core processors. They need to be rich and collaborative, have a real-time feel with millisecond response time and should never stop running. Additionally, modern applications are a mashup of external services that need to be consumed and composed to provide the features at hand.
We are seeing a new type of applications emerging to address these new challenges—these are being called Reactive Applications. In this talk we will discuss four key traits of Reactive; Responsive, Resilient, Elastic and Message-Driven—how they impact application design, how they interact, their supporting technologies and techniques, how to think when designing and building them—all to make it easier for you and your team to Go Reactive.
Intended Audience:
Programmers, architects, CIO/CTOs and everyone with a desire to challenge the status quo and expand their horizons on how to tackle the current and future challenges in the computing industry.
Everything you always wanted to know about Distributed databases, at devoxx l...javier ramirez
Everything you always wanted to know about Distributed databases, at devoxx london, by javier ramirez, teowaki.
Basic concepts of distributed systems, such as consensus, gossip and infection protocols, vector clocks, sharding storage, so you can create highly available distributed systems
Tyler Treat
Workiva
NATS Meetup 3/22/16
• Embracing the reality of complex systems
• Using simplicity to your advantage
• Why NATS?
• How Workiva uses NATS
You can learn more about NATS at http://www.nats.io
Architecting for failure - Why are distributed systems hard?Markus Eisele
Devnexus 2017
As we architect our systems for greater demands, scale, uptime, and performance, the hardest thing to control becomes the environment in which we deploy and the subtle but crucial interactions between complicated systems. And microservices obviously are the way to go forward with those complicated systems. But what makes it so hard to build them? And why should you embrace failure instead of doing what we can do best: Preventing failure. This talk introduces you to the problem domain of a distributed system which consists of a couple of microservices. It shows how to build, deploy and orchestrate the chaos and introduces you to a couple of patterns to prevent and compensate failure.
Building a modern data platform with scala, akka, apache beamRaymond Tay
Gave a talk on at the Scala Meetup on 20 September 2018 on the subject of building a modern data platform with Scala, Akka, Apache Beam.
List of references are as follows:
- Dataflow/Apache Beam (streamingsystems.org/Slides/Eugene Kirpichov - STREAM 2016 Dataflow and Apache Beam.pdf)
- The Dataflow Model (https://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)
- MillWheel (https://ai.google/research/pubs/pub41378)
- FlumeJava (https://ai.google/research/pubs/pub35650)
- Why Curiosity Matters (https://hbr.org/2018/09/curiosity)
- Spotify Scio (https://github.com/spotify/scio)
- Typelevel Cats (typelevel.org/cats)
- Verizon Quiver (https://github.com/Verizon/quiver)
- Streaming 101 (https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101)
- Streaming 102 (https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102)
- Beam vs Spark (https://cloud.google.com/dataflow/blog/dataflow-beam-and-spark-comparison)
- Hierarchical scheduling in diverse data center workloads (https://people.eecs.berkeley.edu/~alig/papers/h-drf.pdf)
- Beam comparison (https://github.com/dataArtisans/beam_comp)
- Dataflow Pipeline Execution Parameters (https://cloud.google.com/dataflow/pipelines/specifying-exec-params#setting-other-local-pipeline-options)
My talk for the Scala meetup at PayPal's Singapore office.
The intention is to focus on 3 things:
(a) two common functions in Apache Spark "aggregate" and "cogroup"
(b) Spark SQL
(c) Spark Streaming
The umbrella event is http://www.meetup.com/Singapore-Scala-Programmers/events/219613576/
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
13. There is no difference
between a slow node
and a dead node
14. Why do we NEED it?
Scalability
when the needs of the
system outgrows what a
single node can provide
15. Why do we NEED it?
Scalability
when the needs of the
system outgrows what a
single node can provide
Availability
Enabling resilience
when a node fails
21. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
The network is secure
Peter Deutsch and
other fellows
at Sun
Microsystems
22. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
The network is secure
2013 cost of cyber
crime study:
United States
23. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
Peter Deutsch and
The network is secure
other fellows
at Sun
The network is
Microsystems
homogeneous
24. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
Peter Deutsch and
The network is secure
other fellows
at Sun
The network is
Microsystems
homogeneous
Latency is zero
Ingo Rammer on
latency vs
bandwidth
25. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
Peter Deutsch and
The network is secure
other fellows
at Sun
The network is
Microsystems
homogeneous
Latency is zero
Bandwidth is infinite
26. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
Peter Deutsch and
The network is secure
other fellows
The network is
at Sun
homogeneous
Microsystems
Latency is zero
Bandwidth is infinite
Topology doesn’t change
27. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
The network is secure
Peter Deutsch and
The network is
other fellows
homogeneous
at Sun
Microsystems
Latency is zero
Bandwidth is infinite
Topology doesn’t change
There is one administrator
28. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
The network is secure
Peter Deutsch and
The network is homogeneous
other fellows
at Sun
Latency is zero
Microsystems
Bandwidth is infinite
Topology doesn’t change
There is one administrator
Transport cost is zero
29. THE FALLACIES OF
DISTRIBUTED COMPUTING
The network is reliable
The network is secure
Peter Deutsch and
The network is homogeneous
other fellows
at Sun
Latency is zero
Microsystems
Bandwidth is infinite
Topology doesn’t change
There is one administrator
Transport cost is zero
30. WHAT WAS TRIED IN THE
PAST?
DISTRIBUTED OBJECTS
REMOTE PROCEDURE CALL
DISTRIBUTED SHARED MUTABLE STATE
31. WHAT WAS ATTEMPTED?
DISTRIBUTED OBJECTS
REMOTE PROCEDURE CALL
DISTRIBUTED SHARED MUTABLE STATE
• Typically, programming constructs to “abstract” the fact
that there are local and distributed objects
• Ignores latencies
• Handles failures with this attitude → “i dunno what to
do…YOU (i.e. invoker) handle it”.
32. WHAT WAS ATTEMPTED?
DISTRIBUTED OBJECTS
REMOTE PROCEDURE CALL
DISTRIBUTED SHARED MUTABLE STATE
• Assumes the synchronous processing model
• Asynchronous RPC tries to model after
synchronous RPC…
33. WHAT WAS ATTEMPTED?
DISTRIBUTED OBJECTS
REMOTE PROCEDURE CALL
DISTRIBUTED SHARED MUTABLE STATE
• Found in Distributed Shared Memory Systems i.e.
“1” address space partitioned into “x” address spaces for “x” nodes
• e.g. JavaSpaces ⇒ Danger of 2 independent
processes to commit successfully.
36. The question really
is “How can we do
better?”
To start,we
need to
understand 2
results
37. FLP IMPOSSIBILITY
RESULT
The FLP result shows that in
an asynchronous model,
where one processor might
crash, there is no distributed
algorithm to solve the
consensus problem.
38. Consensus is a fundamental problem in fault-tolerant
distributed systems. Consensus involves multiple servers
agreeing on values. Once they reach a decision on a value,
that decision is final.
Typical consensus algorithms make progress when any
majority of their servers are available; for example, a cluster
of 5 servers can continue to operate even if 2 servers fail. If
more servers fail, they stop making progress (but will never
return an incorrect result).
Paxos (1989), Raft (2013)
39. CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
40. CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
• Consistency ⇒ all nodes should see the same data,
eventually
41. CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
• Consistency ⇒ all nodes should see the same data,
eventually
• Availability ⇒ System is still working when node(s)
fails
42. CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
• Consistency ⇒ all nodes should see the same data,
eventually
• Availability ⇒ System is still working when node(s)
fails
43. CAP THEOREM
CAP CONJECTURE
(2000) established
as a theorem in 2002
Consistency
Availability
Partition Tolerance
• Consistency ⇒ all nodes should see the same data,
eventually
• Availability ⇒ System is still working when node(s)
fails • Partition-Tolerance ⇒ System is still working on
arbitrary message loss
45. How does knowing
FLP and CAP help me?
It’s really about asking which
would you sacrifice when a
system fails? Consistency
or Availability?
46. You can choose C over A
It needs to preserve reads
and writes i.e. linearizability
i.e. A read will return the last
completed write (on ANY
replica) e.g. Two-Phase-Commit
52. Replication is strongly
related to fault-tolerance
Fault-tolerance relies
on reaching consensus
Paxos (1989), ZAB, Raft
(2013)
53. Consistent Hashing is
used often PARTITIONING
and REPLICATION
C-H commonly used in load-balancing
web objects. In
DynamoDB Cassandra, Riak and
Memcached
54. • If your problem can fit into memory of a
single machine, you don’t need a
distributed system
• If you really need to design / build a
distributed system, the following helps:
• Design for failure
• Use the FLP and CAP to critique
systems
• Algorithms that work for single-node
may not work in distributed mode
• Avoid coordination of nodes
• Learn to estimate your capacity
56. REFERENCES
• http://queue.acm.org/detail.cfm?id=2655736 “The network is
reliable”
• http://cacm.acm.org/blogs/blog-cacm/83396-errors-in-database-systems-
eventual-consistency-and-the-cap-theorem/fulltext
• http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf
• “Towards Robust Distributed Systems,” http://
www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
• “Why Do Computers Stop and What Can be Done About It,”
Tandem Computers Technical Report 85.7, Cupertino, Ca., 1985.
http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf
• http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-
ladis2009.pdf
• http://codahale.com/you-cant-sacrifice-partition-tolerance/
• http://dl.acm.org/citation.cfm?id=258660 “Consistent hashing
and random trees: distributed caching protocols for relieving hot
spots on the World Wide Web”