This document discusses Knewton's use of ZooKeeper and PettingZoo to implement distributed machine learning on a Python cluster. It begins by explaining what ZooKeeper is and how it provides services for distributed synchronization. It then discusses the state of ZooKeeper libraries for Python, including incomplete bindings and lack of high-level recipes. PettingZoo is introduced as Knewton's library that implements common ZooKeeper recipes for Python, allowing their machine learning models to be sharded and scaled across multiple machines. Distributed discovery, distributed bags, leader queues, and role matching are highlighted as key recipes that enable dynamic reconfiguration and load balancing of their distributed system.
A Python Petting Zoo
1. A Python Petting Zoo
Python and ZooKeeper
Devon Jones
Senior Software Engineer
knewt.ly/pettingzoo-slides
2. Knewton is an education technology company with a goal of
bringing adaptive education to the masses. Knewton makes it
possible to break courses into tiny parts that are delivered to
each student as personalized, real-time recommendations.
Knewton recommends the best work for an individual learner by
calculating data on what we know about a student, similar
students, the learning objective, and the content itself at a given
point in time.
The Knewton platform today has tens of thousands of students,
will have over 600k students starting Sept. 2012, and will soon
have millions of students.
3. What is this about?
● What is ZooKeeper, and how is it useful?
● State of ZooKeeper on Python
● The release of PettingZoo, Knewton's
ZooKeeper recipes for managing a
distributed machine learning cluster
4. What is ZooKeeper?
According to ZooKeeper's Apache Site:
ZooKeeper is a centralized service for maintaining configuration information,
naming, providing distributed synchronization, and providing group services. All
of these kinds of services are used in some form or another by distributed
applications. Each time they are implemented there is a lot of work that goes
into fixing the bugs and race conditions that are inevitable. Because of the
difficulty of implementing these kinds of services, applications initially usually
skimp on them, which make them brittle in the presence of change and difficult
to manage. Even when done correctly, different implementations of these
services lead to management complexity when the applications are deployed.
Right, but what is it?
5. What is ZooKeeper?
ZooKeeper is a distributed, filesystem-like data store
built on ZAB, a Paxos-like atomic broadcast algorithm,
with a few valuable features.
● No single point of failure
● Strictly ordered, observable state
● Events
● Sequential and ephemeral primitives
● ACLs
7. What is ZooKeeper for?
ZooKeeper is a platform for creating protocols
for synchronization of distributed systems.
Uses include:
● Distributed configuration
● Queues
● Implementation of distributed concurrency
primitives such as locks, barriers, latches,
counters, etc.
In short, it's a system for managing shared
state between distributed systems.
9. Watches
● Can be set by any read operation
● Will fire an event to the client that set them,
once and only once
● Can be set on nodes or on their data
● Will always be delivered to you in a fixed order
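The one-shot behaviour of watches can be sketched with a small in-memory model. This is only a toy, assuming a hypothetical `ToyStore` class with `get` and `set` methods; it is not the ZooKeeper API or any Python binding. It shows a watch being registered by a read, fired by the next write, and then discarded.

```python
from collections import defaultdict


class ToyStore:
    """In-memory stand-in for a znode tree (hypothetical; NOT the ZooKeeper API)."""

    def __init__(self):
        self._data = {}
        self._watches = defaultdict(list)  # path -> pending one-shot callbacks

    def get(self, path, watch=None):
        """Read a value; optionally register a one-shot watch while reading."""
        if watch is not None:
            self._watches[path].append(watch)
        return self._data.get(path)

    def set(self, path, value):
        """Write a value, then fire and discard every watch on that path."""
        self._data[path] = value
        pending = self._watches.pop(path, [])
        for callback in pending:  # fired in registration order
            callback(path)


store = ToyStore()
events = []
store.set("/config", "v1")
store.get("/config", watch=events.append)  # the read registers the watch
store.set("/config", "v2")                 # fires the watch once
store.set("/config", "v3")                 # no watch left, nothing fires
```

A client that wants to keep observing a node has to set a new watch each time it handles an event, typically by reading the node again.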
10. Sequential Nodes
● Appends a monotonically increasing number
to the end of the znode (file)
● Can be used directly for leader election
● Provides communication of server's view of
ordering
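A rough sketch of how sequential naming gives every client the same view of ordering. SequentialParent is a hypothetical in-memory model, not a ZooKeeper API; the only detail taken from ZooKeeper itself is that the appended counter is zero-padded to ten digits.

```python
import itertools


class SequentialParent:
    """Hypothetical model of a parent znode that hands out sequential children."""

    def __init__(self):
        self._counter = itertools.count()
        self.children = {}

    def create(self, prefix, owner):
        # The server appends a monotonically increasing, zero-padded counter.
        name = "%s%010d" % (prefix, next(self._counter))
        self.children[name] = owner
        return name

    def leader(self):
        # The owner of the lowest sequence number is the leader.
        if not self.children:
            return None
        first = min(self.children, key=lambda name: int(name[-10:]))
        return self.children[first]
```

Because the counter is assigned by the server, two clients that create nodes at the same moment still agree on who came first.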
11. Ephemeral Nodes
● Only exist for the life of a connection
● Disappear if your connection stops
responding to keepalive requests
● Used to ensure reliability against service
disruption for most recipes
● Used to trigger events, such as
reconfiguration if a service goes down that
published discovery configs
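A minimal in-memory sketch of the ephemeral lifecycle, assuming each node remembers the session that created it. EphemeralStore and its methods are hypothetical names and do not talk to a real ZooKeeper ensemble.

```python
class EphemeralStore:
    """Hypothetical model: nodes created with a session vanish when it expires."""

    def __init__(self):
        self.nodes = {}  # path -> (owning session or None, data)

    def create(self, path, data, session=None):
        # session=None models a persistent node.
        self.nodes[path] = (session, data)

    def expire(self, session):
        """Simulate a session that stopped answering keepalives."""
        removed = [p for p, (owner, _) in self.nodes.items() if owner == session]
        for path in removed:
            del self.nodes[path]
        return sorted(removed)
```

In a real deployment each removal would also fire any watches set on the removed nodes, which is what triggers reconfiguration.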
13. Recipes
As a community, we need well tested versions
of these recipes as well as other valuable
protocols built on ZooKeeper
14. State of ZooKeeper
● Very low level
● Coding for it is very complex
● Lots of edge cases
● ZooKeeper needs high level, well tested
libraries
● Very few complete, high level solutions exist
15. ZooKeeper Libraries
The first high level library with significant recipe
implementations has emerged from Netflix. Its
name is Curator.
Unfortunately for us, it's in Java, not Python.
16. Curator
https://github.com/Netflix/curator
● State
○ High level api
○ Non-Resilient Client
○ Documentation
○ Tests, Embeddable ZooKeeper for testing
● Recipes
○ Leader Latch, Leader Election
○ Multiple Locks and Semaphores
○ Multiple Queues
○ Barrier, Double Barrier
○ Shared Counter/Distributed Atomic Long
17. State of ZooKeeper on Python
● Current state is in a lot of flux
● Went from only low level bindings to a
number of incomplete bindings in first half of
2012
● So far nothing like Curator has emerged (but
it appears to be brewing)
18. One of the top ranked results for 'Python ZooKeeper'
19. Summary of Python ZooKeeper
bindings
● There are about 10 presently
● Many suffer from not handling known edge
cases in ZooKeeper
● Some suffer problems with resilient
connections
● The following is derived from the python
ZooKeeper binding census of Ben Bangert
20. Official Bindings
● State
○ Complete access to the ZooKeeper C bindings
○ Full of sharp edges
○ Not a resilient client
○ No recipes
○ Communication with ZooKeeper runs in a C
thread
○ Foundation for most other libraries
○ Very low level
24. State of ZooKeeper on Python:
Kind of a Mess
A project exists to merge some of the high level
bindings in an attempt to create a Python
equivalent of Curator: https://github.com/python-zk
Started by Ben Bangert to merge Kazoo & zc.zk
with an attempt to implement all Curator
recipes.
25. PettingZoo
https://github.com/Knewton/pettingzoo-python
● State
○ Relies on zc.zk
○ Documented, doc strings
○ Tests (mock ZooKeeper)
○ All recipes implemented in a Java version as well
● Recipes
○ Distributed Discovery
○ Distributed Bag
○ Leader Queue
○ Role Match
26. PettingZoo
● In heavy development
● Distributed Discovery, Distributed Bag are
well tested and used in production
● Leader Queue and Role Match are tested,
but undeployed
● PettingZoo will be ported to or merged with
the kazoo effort when it is ready
27. Our Problem
Need to be able to do stream processing of
observations of student interactions with course
material. This involves multiple models that
have interdependent parameters. This requires:
● Sharding along different axes dependent
upon the models
● Subscriptions between models for
parameters
● Dynamic reconfiguration of the environment
to deal with current load
28. Distributed Discovery
Allows services in a dynamic, distributed
environment to be quickly alerted of
service address changes.
● Most service discovery recipes only contain
host:port; Distributed Discovery can share
arbitrary data as well (using yaml)
● Can handle load balancing through random
selection of config
● Handles rebalancing on pool change
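Load balancing by random selection and rebalancing on pool change can be sketched as below. This is an illustration under assumptions, not PettingZoo's implementation: the function names, the dict-shaped pool and the "keep the current config while it is still published" policy are all hypothetical.

```python
import random


def pick_config(pool, rng=random):
    """Pick one published config name at random; None if the pool is empty."""
    if not pool:
        return None
    return rng.choice(sorted(pool))


def rebalance(current, pool, rng=random):
    """On a pool change, keep the current config if it survived, else re-pick."""
    if current in pool:
        return current
    return pick_config(pool, rng)


# Hypothetical pool: node name -> arbitrary config data (yaml-decoded in practice).
pool = {
    "svc_0000000001": {"host": "10.0.0.1", "port": 9000, "shard": "a"},
    "svc_0000000002": {"host": "10.0.0.2", "port": 9000, "shard": "b"},
}
chosen = pick_config(pool)
```

A watch on the pool would call rebalance whenever a provider appears or disappears, so clients of a dead resource move off it quickly.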
29. How does this help us scale?
● Makes discovery of dependencies simple
● Adds to reliability of system by quickly
removing dead resources
● Makes dynamic reconfiguration simple as
additional resources become available
32. Distributed Bag
Recipe for a distributed bag (dbag) that allows
processes to share a collection. Any
participant can post or remove data, alerting all
others.
● Used as a part of Role Match
● Useful for any case where
processes need to share
configuration determined at
runtime
33. How does this help us scale?
● Can quickly alert processes as to who is
subscribing to them
● Reduces load by quickly yanking dead
subscriptions
● Provides event based subscriptions, making
implementation simpler
34. Distributed Bag
[Diagram: <bag> has two children, "Items" (holding
Item 1, Item 2, Item 3) and "Tokens" (holding Token 3)]
● Sequential items contain the actual data
● Can be ephemeral
● Clients set delete watch on discrete items
● Token is set to id of highest item
● Clients set a child watch on the "Tokens"
node
● Can determine exact adds and deletes with a
constant number of messages per delta
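The token scheme above can be sketched from the client's side. This is a hypothetical in-memory model, not PettingZoo's code: BagClient, the fetch callback and the assumption that item ids start at 1 are all illustrative.

```python
class BagClient:
    """Hypothetical client-side view of a distributed bag."""

    def __init__(self):
        self.last_seen = 0  # id of the highest item this client has processed
        self.items = {}     # item id -> data

    def on_token_change(self, token, fetch):
        """Handle the child watch on "Tokens".

        fetch(item_id) returns the item's data, or None if it is already gone.
        Only the ids between last_seen and the new token are looked at.
        """
        added = {}
        for item_id in range(self.last_seen + 1, token + 1):
            data = fetch(item_id)
            if data is not None:
                added[item_id] = data
        self.items.update(added)
        self.last_seen = max(self.last_seen, token)
        return added

    def on_item_deleted(self, item_id):
        """Handle the delete watch on a single item."""
        return self.items.pop(item_id, None)
```

The client never lists the whole bag: an add costs one token event plus one read per new item, and a delete costs one event.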
36. Leader Queue
Recipe is similar to Leader Election, but makes
it easy to monitor your spare capacity.
● Used in Role Match
● As services are ready to do work, they
create an ephemeral, sequential node in the
queue.
● Every member always knows whether it is
waiting in the queue or at the front
● Watch lets leader know when it is elected
37. How does this help us scale?
● Gives a convenient method of assigning
work
● Makes monitoring current excess capacity
easy
38. Leader Queue
[Diagram: <queue> with children C_1, C_3, C_4]
● Candidates register with sequential,
ephemeral nodes
● Candidate sets delete watch on predecessor
● Candidate is elected when it is the smallest
node
● When elected, candidate takes over its new
role
● When ready, candidate removes itself from
the queue
● Only one candidate needs to call
get_children upon any node exiting
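The predecessor rule can be sketched as a pure function over the queue's children. The function names and the "C_<n>" naming are taken from the diagram labels for illustration only; they are not PettingZoo's internal API.

```python
def sequence_of(name):
    """Extract the sequence number from a node name such as 'C_3'."""
    return int(name.rsplit("_", 1)[1])


def predecessor(my_node, children):
    """Return the node to set a delete watch on, or None if my_node is elected."""
    ordered = sorted(children, key=sequence_of)
    index = ordered.index(my_node)
    return ordered[index - 1] if index > 0 else None
```

With children C_1, C_3 and C_4, C_4 watches C_3. If C_3 exits, only C_4 is notified, so only C_4 has to list the children again and start watching C_1.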
39. Leader Queue Usage Example
from pettingzoo.leader_queue import LeaderQueue, Candidate
from pettingzoo.utils import connect_to_zk

class SomeCandidate(Candidate):
    def on_elected(self):
        # Called when this candidate reaches the front of the queue.
        pass  # replace with the work to do once elected

conn = connect_to_zk('localhost:2181')
leaderq = LeaderQueue(conn)
leaderq.add_candidate(SomeCandidate())
40. Role Match
Allows systems to expose needed, long lived
jobs, and for services to take over those jobs
until all are filled.
● Dbag used to expose jobs
● Leader queue used to hold applicants
● Records which jobs are presently held with
ephemeral node
● Lets a new process take over if a worker
dies
● We use it for sharding/segmentation to
dynamically adjust the shards as needed
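A rough sketch of the matching step, assuming jobs and assignments are keyed by the same id and applicants are ordered as in the leader queue. The function names and data shapes are hypothetical and only model the logic, not the ZooKeeper calls.

```python
def open_jobs(jobs, assignments):
    """Ids of jobs in the bag that have no assignment with a matching id."""
    return sorted(set(jobs) - set(assignments))


def fill_jobs(jobs, assignments, applicants):
    """Give each open job to the applicant at the front of the queue.

    Returns the new assignments and the applicants still waiting.
    """
    queue = list(applicants)
    filled = dict(assignments)
    for job_id in open_jobs(jobs, assignments):
        if not queue:
            break
        filled[job_id] = queue.pop(0)
    return filled, queue
```

Because assignments are ephemeral, a worker that dies drops its assignment, the job shows up as open again and the next applicant takes it.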
41. How does this help us scale?
● Core of our ability to dynamically adjust
shards
● Lets the controlling process adjust problem
spaces and have those tasks become
automatically filled
● Makes it easy to monitor who is working
on what, and when
42. Role Match
[Diagram: <match> with children "job" (Distributed
Bag), "applicant" (Leader Queue) and "assignment";
the labels "Assgn 1" and "A_2" appear on the figure]
● Leader monitors for open jobs
● Job holder creates an ephemeral assignment
● Assignment id matches job id, indicating that
it is claimed
43. Future: Distributed Config
Next project is Distributed Config.
● Allows service config to be recorded and
changed with a yaml config
● Every process that connects creates a child
node of the appropriate service
● Any change in a child node's config
overrides the overall service config for that
process
● Any change of the parent or child fires a
watch to let the process know that its config
has changed
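Since this recipe is still planned, the following is only a guess at the override rule: a shallow merge in which a process's child-node settings win over the service-level settings. The function name and the example keys are hypothetical.

```python
def effective_config(service_config, process_overrides):
    """Service-wide settings, with per-process overrides taking precedence."""
    merged = dict(service_config)
    merged.update(process_overrides)
    return merged


service = {"batch_size": 100, "log_level": "INFO"}
override = {"log_level": "DEBUG"}
current = effective_config(service, override)
```

A watch firing on either node would simply recompute the merged view and hand it to the process.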
45. Appendix: Official Bindings
● State
○ Complete access to the ZooKeeper C bindings
○ Full of sharp edges
○ Not a resilient client
○ No recipes
○ Communication with ZooKeeper runs in a C
thread
○ Foundation for most other libraries
○ Very low level