This talk is about Taskerman, a distributed cluster task manager built on top of AWS SQS, Zookeeper and Yelp PaaSTA. The talk was given at Imperial College, London as part of its 'Application of Computing in Industry' series: http://www.imperial.ac.uk/computing/industry/aci/yelp/
Principles in Data Stream Processing | Matthias J Sax, Confluent HostedbyConfluent
Data stream processing is, for many of us, a new paradigm for processing data and building applications. In this talk, we will take you on a journey through the theoretical foundations of stream processing and discuss the underlying principles and unique problems that need to be addressed. What actually is a data stream anyway? And how do I use it? How do streams relate to application state, and when do I use one or the other?
ksqlDB and Kafka Streams are both, at their core, designed to help build stream processing applications. We will explain how stream processing principles are reflected in the design of each system and what trade-offs were chosen (and, more importantly, why). Finally, we look at how the stream processing space, and in particular ksqlDB and Kafka Streams, may evolve over the next few years as we outline extensions and improvements to the underlying conceptual model. So, bring your thinking hats and notepads and prepare to learn WHY these systems are the way they are!
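One common way to picture the relationship between a stream and application state is to fold the events of a stream into a table, and to read the table's updates back out as a changelog stream. The sketch below is a generic Python illustration of that idea only; it is not ksqlDB or Kafka Streams code, and the function name `materialize` and the `(key, delta)` event shape are assumptions made for the example.

```python
from collections import defaultdict


def materialize(events):
    """Fold a stream of (key, delta) events into per-key state.

    Returns the final table and the changelog of (key, new_value)
    updates that the table emitted while it was being built.
    """
    table = defaultdict(int)
    changelog = []
    for key, delta in events:
        table[key] += delta                  # state is the running aggregate
        changelog.append((key, table[key]))  # every update is itself an event
    return dict(table), changelog


# Hypothetical input stream: account balance changes.
events = [("alice", 5), ("bob", 3), ("alice", -2)]
table, changelog = materialize(events)

# Replaying the changelog with last-write-wins rebuilds the same table.
rebuilt = dict(changelog)
```

Here the table is the stream aggregated up to now, and the changelog is the table viewed as a stream, which is why `rebuilt` ends up equal to `table`.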
Reactive applications are becoming a de-facto industry standard and, if employed correctly, toolkits like Lightbend Reactive Platform make the implementation easier than ever. But designing these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to.
In this talk, we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause problems, about trade-offs that might not be fully understood and important choices that might be overlooked. These include persistent actor pitfalls, tackling network partitions, proper implementations of graceful shutdown or distributed transactions, trade-offs of micro-services or actors and more.
This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that they won’t be repeated by fellow developers. The talk is a little more focused on the Lightbend platform, but an understanding of the concepts we are going to talk about should be beneficial for everyone interested in this field.
An Introduction to Rearview - Time Series Based Monitoring VictorOps
Jeff Simpson, senior software engineer at VictorOps, delivered this presentation at the Frontrange Alerting & Monitoring meetup...along with an awesome live demo.
Presentation slides from DevConf.cz 2017
Challenges, take-aways and recommendations on scaling up OpenShift's logging and metrics stack.
Authors:
Ricardo Lourenço:
https://www.linkedin.com/in/ricardopereira4it/
Elvir Kuric
https://www.linkedin.com/in/elvirkuric/
A key feature when monitoring and debugging any Cloud infrastructure is to provide the ability to trace, track, and collate all the individual, discrete steps that compose an event. A typical resource action in OpenStack is often a combination of smaller tasks -- which given the distributed nature of OpenStack -- can fail at unpredictable points in the workflow. By collecting the appropriate events, operators can view all events within Ceilometer, filter on a failed action and trace back the history of related events to spot anomalies or errors. In this talk, we provide an overview of the recent enhancements made in Ceilometer to support the collection of event notifications from OpenStack services. We will describe: how events are processed, transformed and stored in Ceilometer; how you can derive metrics from events; and how it’s possible to track the events of a resource and analyse where errors occur.
Security Monitoring for big Infrastructures without a Million Dollar budget Juan Berner
Nowadays, in an increasingly complex and dynamic network, it's not enough to be a regex ninja and store only the logs you think you might need. From network traffic to custom logs, you won't know which logs will be crucial to stop the next attacker, and if you are not planning to spend half of your security budget on a commercial solution, we will show you a way to build your own SIEM with open source. The talk will go from how to build a powerful logging environment for your organization to scaling on the cloud and storing everything forever. We will walk through how to build such a system with open source solutions such as Elasticsearch and Hadoop, and how to create your own custom monitoring rules to monitor everything you need. The talk will also cover how to secure the environment and allow restricted access to other teams, as well as how to avoid common pitfalls and meet compliance standards.
Monitoring Uptime on the NeCTAR Research Cloud - Andy Botting, University of ... OpenStack
Audience Level
Intermediate
Synopsis
We will discuss how we do monitoring on the Nectar research cloud, using tools like OpenStack Tempest and Nagios, and how we translate the results into a user-facing dashboard.
Speaker Bio:
Andy is a DevOps engineer working at the University of Melbourne in the Core Services team for the Nectar Research Cloud.
Quick dive into the big data pool without drowning - Demi Ben-Ari @ Panorays Demi Ben-Ari
Everybody wants to ride the “Big Data” hype cycle, “To do Scale”, and to use the coolest tools on the market, like Hadoop, Apache Spark, Apache Cassandra, etc.
But do they ask themselves whether there is really a reason for that?
In the talk we’ll give a brief overview of all of the technologies in the Big Data world nowadays, and we’ll talk about the problems that really emerge when you’d like to enter the great world of Big Data handling.
Showing you the Hadoop ecosystem, Apache Spark and all of the distributed tools leading the market today will give you a notion of the real costs of entering that world.
I promise to share some stories from the trenches :)
(And about the “pool” thing...I don’t really know how to swim)
Most database products have their own auditing functionality or plugins, but these always involve overhead, which means they end up turned off or with the bare minimum enabled.
In this workshop we will show how to get reliable logging for MySQL and MongoDB servers in a scalable and non-intrusive way, discuss its drawbacks, and show how we can build our own open source tools to achieve results similar to most commercial products.
Tools to sniff, process and act upon queries will be shared, and we will show how simple it is to set up and monitor a database environment so that it can be replicated and grow horizontally. All the code needed will be published.
This is the speech Shen Li gave at GopherChina 2017.
TiDB is an open source distributed database. Inspired by the design of Google F1/Spanner, TiDB features infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for data storage and analysis.
In this talk, we will mainly cover the following topics:
- What is TiDB
- TiDB Architecture
- SQL Layer Internal
- Golang in TiDB
- Next Step of TiDB
The Dark Side Of Go -- Go runtime related problems in TiDB in production PingCAP
Ed Huang, CTO of PingCAP, talked at Go System Conference about dealing with the typical and profound issues related to Go’s runtime as your systems become more complex. Taking TiDB as an example, he demonstrated how these problems can be reproduced, located, and analyzed in production.
This is the speech Max Liu gave at Percona Live Open Source Database Conference 2016.
Max Liu: Co-founder and CEO, a hacker with a free soul
The slides covered the following topics:
- Why another database?
- What kind of database do we want to build?
- How to design such a database, including the principles, the architecture, and design decisions?
- How to develop such a database, including the architecture and the core technologies for TiKV and TiDB?
- How to test the database to ensure the quality and stability?
Large-Scale Automated Storage on Kubernetes - Matt Schallert OSCON 2019 Matt Schallert
Managing large stateful applications is tough.
Matt Schallert outlines how, using Kubernetes, Uber automated managing a challenging stateful workload—M3DB, its sharded, replicated, multizone time series database—and examines the operational challenges the company faced while scaling M3DB from a handful of clusters to over 40 clusters across multiple data centers and cloud providers, all while trying to create an environment-agnostic solution for open source users.
Matt then demonstrates methods of managing stateful workloads in a declarative manner to ease operational burden. You’ll see how M3DB’s declarative approach to cluster management can be extended to other workloads using its common set of open source libraries. This approach made orchestrating M3DB easier.
Along the way, Matt shares lessons learned that you can apply to a variety of stateful workloads across bare metal and cloud environments, regardless of whether it’s running under an orchestration system or managing instances directly. You’ll walk away with advice for managing stateful systems at scale and lessons to bear in mind when considering using an orchestration system for state management.
Order from chaos: automating monitoring configuration Sensu Inc.
In a high-performance computing shop with over 3,000 nodes, Harvard FAS Research Computing can’t afford chaos around our monitoring checks! In this Sensu Summit 2019 talk, you'll hear from Harvard SRE Molly Duggan about how they’re using CI/CD pipelines and the Sensu Go API to ensure that all changes to their monitoring system are validated, reproducible, and version controlled.
NetflixOSS Meetup S3 E1, covering latest components in Distributed Databases, Telemetry systems, Big Data tools and more. Speakers from Netflix, IBM Watson, Pivotal and Nike Digital
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022 StreamNative
Starting with version 2.10, the Apache ZooKeeper dependency has been eliminated and replaced with a pluggable framework that enables you to reduce the infrastructure footprint of Apache Pulsar by leveraging alternative metadata and coordination systems based on your deployment environment. In this talk, walk through the steps required to utilize the existing etcd service running inside Kubernetes to act as Pulsar's metadata store, thereby eliminating the need to run ZooKeeper entirely, leaving you with a ZooKeeper-less Pulsar.
Scalable complex event processing on samza @UBER Shuyi Chen
The Marketplace data team at Uber has built a scalable complex event processing platform to solve many challenging real time data needs for various Uber products. This platform has been in production for almost a year and it has proven to be very flexible to solve many use cases. In this talk, we will share in detail the design and architecture of the platform, and how we employ Samza, Kafka, and Siddhi at scale.
These slides were presented at the Stream Processing Meetup @ LinkedIn on June 15, 2016.
Orchestrating Cassandra with Kubernetes: Challenges and Opportunities Raghavendra Prabhu
This is a talk about orchestration of Cassandra with the Cassandra operator, Kubernetes and Yelp PaaSTA (https://github.com/Yelp/paasta).
The talk was presented at Computer Laboratory, University of Cambridge as part of the Engineering, Science and Technology Event (https://www.careers.cam.ac.uk/recruiting/event2Tech.asp) in November 2019.
Orchestrating Cassandra with Kubernetes Operator and PaaSTA Raghavendra Prabhu
Video URL: https://youtu.be/GjI6MUz7AyE
This is the slide deck of the Percona Live Online 2020 talk given by me in May 2020: https://www.percona.com/resources/videos/orchestrating-cassandra-kubernetes-operator-and-yelp-paasta-percona-live-online
The talk delves into the architecture of our Cassandra Kubernetes Operator and the multi-region multi-AZ clusters it manages, and strategies we have in place for safe rollouts and zero-downtime migration.
Eko10 - Security Monitoring for Big Infrastructures without a Million Dollar ... Hernan Costante
Nowadays, in an increasingly complex and dynamic network, it's not enough to be a regex ninja and store only the logs you think you might need. From network traffic to custom logs, you won't know which logs will be crucial to stop the next attacker, and if you are not planning to spend half of your security budget on a commercial solution, we will show you a way to build your own SIEM with open source. The talk will go from how to build a powerful logging environment for your organization to scaling on the cloud and storing everything forever. We will walk through how to build such a system with open source solutions such as Elasticsearch and Hadoop, and how to create your own custom monitoring rules to monitor everything you need. The talk will also cover how to secure the environment and allow restricted access to other teams, as well as how to avoid common pitfalls and meet compliance standards.
Scala like distributed collections - dumping time-series data with apache spark Demi Ben-Ari
Spark RDDs are almost identical to Scala collections, just distributed: all of the transformations and actions are derived from the Scala collections API.
As Martin Odersky mentioned, “Spark - The Ultimate Scala Collections” is the right way to look at RDDs. But with that great distributed power comes a great many data problems: at first you’ll start tackling the concept of partitioning, then the actual data becomes the next thing to worry about.
In the talk we’ll go through an overview of Spark's architecture, and see how similar RDDs are to the Scala collections API. We'll then shift to the world of problems that you’ll be facing when using Spark for processing a vast volume of time-series data with multiple data stores (S3, MongoDB, Apache Cassandra, MySQL).
When you start tackling many scale and performance problems, many questions arise:
> How to handle missing data?
> Should the system handle both serving and backend processes, or should we separate them out?
> Which solution is cheaper?
> How do we get the best performance for money spent?
In the talk we will tell the tale of all of the transformations we’ve made to our data and review the multiple data persistency layers... and I’ll try my best NOT to answer the question “which persistency layer is the best?” but I do promise to share our pains and lessons learned!
LinkedIn has multiple data centers hosting tens of thousands of servers. A large percentage of these servers host our data infrastructure, and our distributed data store, Espresso, accounts for a sizeable share of them. The fleet of servers contains various hardware components including, but not limited to, SSDs, and hardware has a tendency to fail from time to time. When hardware fails, the servers need to undergo maintenance, which can take a significant amount of time depending on the type of failure. This reduces capacity for that duration and poses an interesting problem: maintaining capacity in the face of multiple failures. This talk covers how LinkedIn uses Camunda, wrapped with several components, to achieve hands-off capacity management via multiple workflows, with asynchronous pauses and synchronisation among them. It will also highlight how we achieved seamless integrations with various platforms and components within LinkedIn's infrastructure, and a few best practices that helped us achieve the final state.
Reactive mistakes - ScalaDays Chicago 2017 Petr Zapletal
Reactive applications are becoming a de-facto industry standard and, if employed correctly, toolkits like Lightbend Reactive Platform make the implementation easier than ever. But designing these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to. In this talk we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause trouble, about trade-offs that might not be fully understood and important choices that might be overlooked, including persistent actor pitfalls, tackling network partitions, proper implementations of graceful shutdown or distributed transactions, trade-offs of micro-services or actors and more.
This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that they won’t be repeated by fellow developers. The talk is a little more focused on the Lightbend platform, but an understanding of the concepts we are going to talk about should be beneficial for everyone interested in this field.
OSMC 2018 | Learnings, patterns and Uber’s metrics platform M3, open sourced ... NETWAYS
At Uber we use high cardinality monitoring to observe and detect issues with our 4,000 microservices running on Mesos and across our infrastructure systems and servers. We’ll cover how we put the resulting 6 billion plus time series to work in a variety of different ways, auto-discovering services and their usage of other systems at Uber, setting up and tearing down alerts automatically for services, sending smart alert notifications that rollup different failures into individual high level contextual alerts, and more. We’ll also talk about how we accomplish all this with a global view of our systems with M3, our open source metrics platform. We’ll take a deep dive look at how we use M3DB, now available as an open source Prometheus long term storage backend, to horizontally scale our metrics platform in a cost efficient manner with a system that’s still sane to operate with petabytes of metrics data.
This talk was given at Cassandra London meetup: https://www.meetup.com/Cassandra-London/events/267271963/ . The talk is about orchestration of Cassandra with our Kubernetes Operator and Yelp PaaSTA. We also outline some of the opportunities and challenges associated with this architecture.
Youtube link: https://www.youtube.com/watch?v=JqAILFkkibA
This talk is about orchestration of Cassandra on Kubernetes with the Cassandra Operator and Yelp's Platform-as-a-Service, PaaSTA. The talk focusses specifically on the internals of the Cassandra operator and its core reconcile loop, which reconciles cluster state and on-disk configuration.
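For readers unfamiliar with the term, a reconcile loop repeatedly compares a desired state with the observed state and applies whatever actions close the gap. The sketch below is a generic, in-memory Python illustration of that pattern; it is not the Cassandra operator's code, and `ClusterState`, `plan_actions` and `reconcile` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical, simplified cluster view: node name -> config version."""
    nodes: dict = field(default_factory=dict)


def plan_actions(desired, observed):
    """Diff desired against observed and list the actions needed to converge."""
    actions = []
    for name, version in sorted(desired.nodes.items()):
        if name not in observed.nodes:
            actions.append(("create", name))
        elif observed.nodes[name] != version:
            actions.append(("update", name))
    for name in sorted(observed.nodes):
        if name not in desired.nodes:
            actions.append(("delete", name))
    return actions


def reconcile(desired, observed, max_rounds=10):
    """Apply planned actions until nothing is left to do.

    Returns the number of rounds in which actions were applied.
    """
    for round_no in range(max_rounds):
        actions = plan_actions(desired, observed)
        if not actions:
            return round_no
        for verb, name in actions:
            if verb == "delete":
                del observed.nodes[name]
            else:  # create and update both write the desired version
                observed.nodes[name] = desired.nodes[name]
    raise RuntimeError("state did not converge")


desired = ClusterState({"a": "v2", "b": "v1"})
observed = ClusterState({"a": "v1", "c": "v1"})
first_plan = plan_actions(desired, observed)
rounds = reconcile(desired, observed)
```

In a real operator the observed state would come from the cluster and from the configuration on disk, and each action could fail and be retried on the next pass, which is why the loop re-plans from scratch on every round instead of trusting the previous plan.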
This is a talk about safe and high velocity automation on AWS (Amazon Web Services) with AWS Systems Manager, and is applicable for use cases such as reliability engineering and deployment automation.
Talk given on the state of NUMA with Java databases such as Cassandra, how it can be improved or ameliorated, and how it compares with traditional storage engines.
Clusternaut: Orchestrating Percona XtraDB Cluster with Kubernetes. Raghavendra Prabhu
This talk was presented at the MySQL & Friends devroom at FOSDEM 2016 in Brussels: https://fosdem.org/2016/schedule/event/clusternaut/
Devroom: https://fosdem.org/2016/schedule/track/mysql_and_friends/
Gone are the days when companies used to be strictly colocated in a single office. Distributed workplaces are gradually becoming the norm rather than the exception. So, it is essential that we talk more about it and discuss it.
So, this talk is essentially about:
a) Productivity and working from home.
b) Scheduling flexibility.
c) Challenges in communication and ways to overcome them.
d) Ways of getting such a job and Open Source.
e) Measuring work and micro-management
f) Feeling of detachment and workarounds for it.
To sum up, I will make this talk a very informative and entertaining one, as a lightning talk ought to be.
Securing databases with systemd for containers and services Raghavendra Prabhu
Data is the most valuable entity associated with a system, particularly when it is a sensitive one. Not only are there threats associated with physical access to the box, but also ones where logical access suffices, such as SQL injection.
Vulnerabilities like Shellshock and Heartbleed have also shown that an exploit in one component can be used to access others through buffer overflows, memory overruns etc. and/or severely impact the immunity of the system.
This is where the "Principle of least privilege" comes into play. Wikipedia defines it as requiring that "in a particular abstraction layer of a computing environment, every module (such as a process, a user or a program depending on the subject) must be able to access only the information and resources that are necessary for its legitimate purpose".
Dock'em: Distributed Systems Testing with NetEm and Docker Raghavendra Prabhu
This talk is about distributed systems testing of Galera with NetEm and Docker!
Video of the talk: https://www.youtube.com/watch?v=YBuuvhSO38s&list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2&index=1
Playlist: https://www.youtube.com/playlist?list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2
Galera with Docker: How Synchronous Replication and Linux Containers mesh tog... Raghavendra Prabhu
How Galera (the synchronous replication plugin for Percona XtraDB Cluster) can be used with Docker (or Linux containers in general) to 'mesh' well.
Video of the talk: https://www.youtube.com/watch?v=3A8EF549Q3Y&list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2&index=2
Playlist: http://www.youtube.com/playlist?list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2
Jutsu or Dô: Open documentation: continuous process than a body Raghavendra Prabhu
This talk is about open source documentation and how it can be improved for the community!
Video: https://www.youtube.com/watch?v=sG6jORFwhEA&list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2&index=3
Playlist: http://www.youtube.com/playlist?list=PLctlsn9Gs8wbx47tuhxuNytdrsDf_LWI2
Corpus collapsum: Partition tolerance of Galera in a noisy high load environmentRaghavendra Prabhu
This is the talk given at Highload++ 2014 in Moscow, Russia. The topic was partition tolerance testing of Galera in a noisy high load environment with NetEm and Docker.
Corpus collapsum: Partition tolerance of Galera put to test Raghavendra Prabhu
This is the talk given at RICON 2014 (ricon.io) on partition tolerance testing of Galera with docker and netem.
Video: https://www.youtube.com/watch?v=xRD6A8TY_Uw
Link to the talk: http://ricon.io/event-details/index.html#corpus-collapsum
Acidic clusters - Review of contemporary ACID-compliant databases with synchr... Raghavendra Prabhu
This talk reviews present-day database clusters that employ synchronous replication while being ACID compliant. ACID compliance implies the ability to support transactions across nodes. As part of this talk, PXC (Percona XtraDB Cluster)/Galera, Google F1 based on Spanner/CFS and MySQL Cluster will be considered. The primary objective is to expound the features of each in order to highlight the differences and commonalities between them.
Running virtualized Galera instances for fun and profit Raghavendra Prabhu
This is the talk given at linux.conf.au 2014 in Perth, in the sysadmin miniconf.
The talk is on how Galera instances can be used better when there is virtualization in place, as in today's OpenStack environments and such.
ACIDic Clusters: Review of current relational databases with synchronous replic... Raghavendra Prabhu
These are the slides from the talk given at Percona Live 2014 MySQL Conference and Expo (PLMCE): http://www.percona.com/live/mysql-conference-2014/sessions/acidic-clusters-review-current-relational-databases-synchronous-replication
3. Some numbers!
● 30 million unique mobile app users.
● 74 million UMVs via mobile web.
● 84 million UMVs via desktop.
● More than 142 million rich, local reviews.
● 78% of all searches on Yelp came from mobile.
● 9 offices around the world.
● 4,000+ employees worldwide and 400+ engineers
(as of Q3 2017)
6. ….
● Memcached
● Redis
● Spark
● Redshift
● DynamoDB
● S3
● OSCON talk on data tiers at Yelp
And many more!
7. Distributed Systems Team
● Several TB in prod Cassandra clusters with tens of nodes in each.
● Half a million messages/second in our streaming pipeline
● Several TB in Elasticsearch in prod with several hundred nodes
● All are multi-AZ multi-region
● And growing…
10. ● Safe
● Generic and Extensible
● Distributed
● Loosely coupled
● Not ad-hoc
○ Reviewed
● Sound config management
The Why I
11. ● Schedulable
● Reusable
● Cluster awareness
● Easily maintainable and observable
○ Not a black box.
○ More Ironman, less Ultron
● Prior Art
○ Downsides
The Why II
12. ● Paramount*
● Serialized execution
○ ‘m’ out of ‘n’
○ Disjoint jobs.
● Avoid cascade
● Privilege escalation
● Push-based
* Unless oncall is automated too.
Safety
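One way to read the "'m' out of 'n'" bullet is that a task touches at most m of a cluster's n nodes at a time and stops as soon as a node fails, so that one bad step cannot cascade. The sketch below is a minimal in-memory illustration of that reading only; `batches`, `run_serialized` and the `task` callback are hypothetical names, not Taskerman's API.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple


def batches(nodes: Sequence[str], m: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most m nodes."""
    if m < 1:
        raise ValueError("m must be at least 1")
    for start in range(0, len(nodes), m):
        yield list(nodes[start:start + m])


def run_serialized(
    nodes: Sequence[str], m: int, task: Callable[[str], bool]
) -> Tuple[List[str], Optional[str]]:
    """Run task batch by batch and stop at the first node that fails.

    Returns (nodes completed, failing node or None). Nodes inside a batch
    run one after another here; a real system might run them in parallel.
    """
    done: List[str] = []
    for batch in batches(nodes, m):
        for node in batch:
            if not task(node):
                # Halt immediately so the remaining nodes stay untouched.
                return done, node
            done.append(node)
    return done, None
```

Calling `run_serialized(nodes, 1, task)` gives strictly one-node-at-a-time execution; a larger m trades safety for speed.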
13. Quotes
“There are only two hard problems in distributed systems:
2. Exactly-once delivery
1. Guaranteed order of messages
2. Exactly-once delivery”
@mathiasverraes
“There are 2 hard problems in computer science: cache
invalidation, naming things, and off-by-1 errors.”
@secretGeek
14. ● Network is reliable
● Latency is zero
● Bandwidth is infinite
● Network is secure
● One administrator
● Transport cost is zero
● Network is homogeneous
● Topology doesn't change
Fallacies of Distributed Systems
23. ● The executor of Taskerman
● Dequeues a task and executes it
○ Pre-defined reviewed code.
● Scheduled on node
● Zookeeper for coordination
● Task deleted upon success
● Dead letter queue upon failed retries
Taskrunner
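The Taskrunner loop described on this slide can be sketched roughly as follows. This is an in-memory stand-in, not the real implementation: a `deque` plays the role of the SQS queue, a plain list plays the dead letter queue, and `Task`, `run_tasks`, `handlers` and `max_attempts` are hypothetical names chosen for illustration. Only handlers registered up front can run, which mirrors the "pre-defined reviewed code" bullet.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Task:
    name: str          # which pre-registered handler to run
    node: str          # node the task targets
    attempts: int = 0  # how many times execution has been tried


def run_tasks(
    queue: Deque[Task],
    dead_letters: List[Task],
    handlers: Dict[str, Callable[[str], None]],
    max_attempts: int = 3,
) -> List[Task]:
    """Drain the queue and return the tasks that succeeded."""
    completed: List[Task] = []
    while queue:
        task = queue.popleft()
        handler = handlers.get(task.name)
        if handler is None:
            # Unknown task names are never executed; park them for a human.
            dead_letters.append(task)
            continue
        task.attempts += 1
        try:
            handler(task.node)
        except Exception:
            if task.attempts >= max_attempts:
                dead_letters.append(task)  # retries exhausted
            else:
                queue.append(task)         # requeue for another attempt
        else:
            completed.append(task)         # success: task leaves the queue
    return completed
```

In a deployment backed by a managed queue, the requeue and dead-letter steps would normally be handled by the queue service's redrive settings rather than by the runner itself.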
26. ● Atomic Counters
○ Statistics on actions
○ Circuit breakers
■ Dead man’s switch
■ Prevent failure cascade
○ Automatic reset
● What is Atomic
○ Serializability
Zookeeper
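The counter-backed circuit breaker with automatic reset can be illustrated with a small sketch. It assumes a single process and uses a `threading.Lock` as a stand-in for the atomicity that Zookeeper provides across nodes; `CircuitBreaker`, `threshold`, `reset_after` and the injectable `clock` are hypothetical names, not part of Taskerman or of any Zookeeper client.

```python
import threading
import time
from typing import Callable


class CircuitBreaker:
    """Counts failures and blocks execution once a threshold is reached."""

    def __init__(
        self,
        threshold: int,
        reset_after: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lock = threading.Lock()
        self._threshold = threshold
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._window_start = clock()

    def _maybe_reset(self) -> None:
        # Automatic reset: start a fresh window once the old one has expired.
        now = self._clock()
        if now - self._window_start >= self._reset_after:
            self._failures = 0
            self._window_start = now

    def record_failure(self) -> int:
        """Atomically increment the failure counter and return its value."""
        with self._lock:
            self._maybe_reset()
            self._failures += 1
            return self._failures

    def allows_execution(self) -> bool:
        """Return False while the breaker is tripped."""
        with self._lock:
            self._maybe_reset()
            return self._failures < self._threshold
```

A runner would call `allows_execution()` before dequeuing work and `record_failure()` whenever a task fails, so repeated failures stop further actions until the window resets.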
27. ● Staleness
○ Nodes can go down
● Garbage collection
○ Cleanup of ZK data structures
● Composition
● Starvation
● Uptime
Zookeeper: Challenges
28. ● Failure is the norm, not an exception
● Multiple vectors of failure
● Pessimistic approach
○ Job retry
○ Job Counter
● Mitigation vs Alerting
Failure
29. ● Heartbeat ping
○ End-to-end monitoring
● Dead Letter Queue
○ Recycle bin of failed tasks.
○ Hooks into the human side of monitoring
● Others
○ Separation of state
○ Mutability
Failure handling