This document summarizes how Booking.com solved scalability issues with its Event Graphite Processor (EGP) system. The EGP processes large volumes of event data to generate metrics but was limited by high RAM usage. A new approach was developed that uses event streaming and parallelization, cutting processing time from 80 seconds to 40 seconds while using far less RAM. The migration was carried out in a one-day hackathon in which eight people rewrote 260 monitors. The new system uses 56-core servers, processes events in parallel groups, and requires only 500MB of RAM compared to the previous 16GB.
Three engineers, at various points, each take their own, increasingly ambitious approach to adding Rust to a C codebase. I initially just wanted to replace the server’s networking and event loop with an equally fast Rust implementation. We’d reuse many core components that were in C and just call into them from Rust. Surely it wouldn’t be that much code…
Pelikan is Twitter’s open source and modular framework for in-memory caching, allowing us to replace Memcached and Redis forks with a single codebase and achieve better performance. At Twitter, we operate hundreds of cache clusters storing hundreds of terabytes of small objects in memory. In-memory caching is critical, and demands performance, reliability, and efficiency.
In this talk, I’ll share my adventures in working on Pelikan and how rewriting it in Rust can be more than just a meme.
Stress Testing at Twitter: a tale of New Year's Eves (Herval Freire)
Failure testing is a fundamental piece of Twitter’s reliability engineering. Over the years, we developed a rich toolchain that allows us to detect and fix scalability problems long before they happen. In this talk, we’ll cover some of the strategies we employ and discuss our always evolving approach to API stress testing and its “unit test” equivalent, redline testing.
In file systems, large sequential writes are more beneficial than small random writes, and hence many storage systems implement a log-structured file system. In the same way, the cloud favors large objects over small objects. Cloud providers place throttling limits on PUTs and GETs, so it takes significantly longer to upload a bunch of small objects than one large object of the aggregate size. Moreover, there are per-PUT costs associated with uploading smaller objects.
At Netflix, a lot of media assets and their associated metadata are generated and pushed to the cloud.
We would like to propose a strategy to compact these small objects into larger blobs before uploading them to the cloud, as sketched below. We will discuss how to select the relevant smaller objects and manage the indexing of these objects within the blob, along with the changes to reads, overwrites, and deletes.
Finally, we will showcase the potential impact of such a strategy on Netflix assets in terms of cost and performance.
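To make the compaction idea concrete, here is a minimal sketch (hypothetical names, not Netflix's implementation): small objects are concatenated into one blob while a per-object index of (offset, length) is kept, so a later read becomes a ranged read inside the uploaded blob (e.g. a single HTTP Range GET).

```python
from typing import Dict, Tuple

def pack(objects: Dict[str, bytes]) -> Tuple[bytes, Dict[str, Tuple[int, int]]]:
    """Concatenate small objects into one blob; record (offset, length) per key."""
    index, chunks, offset = {}, [], 0
    for key, data in objects.items():
        index[key] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    return b"".join(chunks), index

def read(blob: bytes, index: Dict[str, Tuple[int, int]], key: str) -> bytes:
    """A read is now a ranged read inside the blob instead of a separate GET."""
    offset, length = index[key]
    return blob[offset:offset + length]

blob, index = pack({"meta/1.json": b'{"a": 1}', "meta/2.json": b'{"b": 2}'})
assert read(blob, index, "meta/2.json") == b'{"b": 2}'
```

Overwrites and deletes then become index updates (tombstones) plus occasional re-compaction, which is exactly the bookkeeping the abstract promises to discuss.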
Using eBPF to Measure the k8s Cluster Health (ScyllaDB)
As a k8s cluster admin, your app teams expect your cluster to be available to deploy services at any time without problems. While there is no shortage of metrics in k8s, it's important to have the right metrics to alert on issues and give you enough data to react to potential availability problems. Prometheus has become a standard and sheds light on the inner behaviour of Kubernetes clusters and workloads. Many KPIs (CPU, IO, network, etc.) that are precise in our on-premise environment become less so when we move to a cloud environment. eBPF is the perfect technology to fill that gap, as it gives us information down to the kernel level. In 2018, Cloudflare shared an open-source project to expose custom eBPF metrics in Prometheus. Join this session and learn about:
• What is eBPF?
• What types of metrics can we collect?
• How to expose those metrics in a k8s environment.
This session aims to deliver a step-by-step guide on how to take advantage of the eBPF exporter.
Cortex: Prometheus as a Service, One Year On (Kausal)
Presented by Tom Wilkie at PromCon 2017
With Speaker Notes: https://goo.gl/V9qGva
At PromCon 2016, I presented "Project Frankenstein: A multitenant, horizontally scalable Prometheus as a service". It's now one year later, and lots has changed - not least the name! This talk will discuss what we've learnt running a Prometheus service for the past year, the architectural changes we made from the original design, and the improvements we've made to the Cortex user experience.
OSNoise Tracer: Who Is Stealing My CPU Time? (ScyllaDB)
In the context of high-performance computing (HPC), the Operating System Noise (osnoise) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise to the application. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
HPC users and developers who care about every microsecond stolen by the OS need not only a precise way to measure osnoise but, more importantly, a way to figure out who is stealing CPU time so that they can perfectly tune the system. These users and developers are the inspiration for Linux's osnoise tracer.
The osnoise tracer runs an in-kernel loop measuring how much time is available. It does this with preemption, softirqs, and IRQs enabled, thus allowing all sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit points of any source of interference. When noise happens without any interference from the operating-system level, the tracer can safely attribute it to hardware-related noise. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that help the user pinpoint the culprits of the noise in a precise and intuitive way.
At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources, serving as a benchmark tool.
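As a rough userspace approximation of that measurement loop (the real tracer runs in-kernel and uses tracepoints to attribute each gap to its source; this Python sketch only detects that time was stolen, and the threshold value is arbitrary):

```python
import time

def measure_noise(runtime_s: float = 1.0, threshold_ns: int = 5_000):
    """Spin reading the clock; any gap between consecutive reads above the
    threshold is time stolen from this thread (IRQs, other threads, SMIs...)."""
    deadline = time.perf_counter_ns() + int(runtime_s * 1e9)
    last = time.perf_counter_ns()
    total_noise = max_noise = 0
    while last < deadline:
        now = time.perf_counter_ns()
        gap = now - last
        if gap > threshold_ns:
            total_noise += gap
            max_noise = max(max_noise, gap)
        last = now
    cpu_available = 100.0 * (1 - total_noise / (runtime_s * 1e9))
    return total_noise, max_noise, cpu_available

noise, max_single, pct = measure_noise()
print(f"sum={noise}ns max_single={max_single}ns cpu_available={pct:.2f}%")
```

The three values printed mirror what the tracer itself reports at the end of a period.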
High-Performance Networking Using eBPF, XDP, and io_uring (ScyllaDB)
In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques range from polling blocking sockets, to non-blocking sockets controlled by epoll, all the way through completely bypassing the Linux kernel for maximum network performance, where you talk directly to the network interface card using something like DPDK or Netmap. All these tools have their place, and generally occupy a spectrum from convenience to performance. But in recent years, that landscape has changed massively. The tools available to the average Linux systems developer have improved, from the creation of io_uring to the expansion of BPF from a simple filtering language to a full-on programming environment embedded directly in the kernel. Along with that came something called XDP (eXpress Data Path), the Linux kernel's answer to kernel-bypass networking. AF_XDP is the new socket type created by this feature, and it generally works very similarly to something like DPDK.
History lessons out of the way, this talk will look into and discuss the merits of this technology, its place in the broader ecosystem, and how it can be used to attain the highest level of performance possible. The talk will dive into crucial details, such as how AF_XDP works, how it can be integrated into a larger system, and finally more advanced topics such as request sharding/load balancing. There will be a detailed look at the design of AF_XDP, the eBPF code used, as well as the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking, and most importantly, how to put all this together to handle as much data as possible on a single modern multi-core system.
Rust promises developers the execution speed of non-managed languages like C++, with the safety guarantees of managed languages like Go. Its fast rise in popularity shows this promise has been largely upheld.
However, the situation is a bit muddier for the newer asynchronous extensions. This talk will explore some of the pitfalls that users may face while developing asynchronous Rust applications, pitfalls that have direct consequences for their ability to hit that sweet low p99. We will see how the Glommio asynchronous executor tries to deal with some of those problems, and what the future holds.
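The talk is about Rust executors, but the core pitfall (a task that computes without yielding stalls everything else sharing the reactor, which is precisely what blows up the p99) is common to all cooperative schedulers. A tiny Python asyncio analogy, not Glommio code:

```python
import asyncio
import time

async def well_behaved(results):
    t0 = time.perf_counter()
    await asyncio.sleep(0.01)                  # expects to wake after ~10 ms
    results.append((time.perf_counter() - t0) * 1000)

async def cpu_hog():
    time.sleep(0.2)                            # never awaits: the event loop stalls

async def main():
    results = []
    await asyncio.gather(well_behaved(results), cpu_hog(), well_behaved(results))
    print([f"{ms:.0f} ms" for ms in results])  # one latency is ~200 ms, not ~10 ms

asyncio.run(main())
```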
Containers are the future for all microservice-based apps. Where do you deploy them? How do you manage them? At Digital Ocean we went through the growing pains of trying out five approaches to scheduling Docker containers: Mesos, Kubernetes, Docker Swarm, Nomad, and even manual scheduling. Let us walk you through how we chose different schedulers for different applications, with tips and tricks for choosing a scheduler of your own.
Golang Performance: microbenchmarks, profilers, and a war story (Aerospike)
Slides for Brian Bulkowski's talk about Golang performance:
microbenchmarks, profilers, and a war story about optimizing the Aerospike Database Go client.
http://www.meetup.com/Go-lang-Developers-NYC/events/216650022/
Rapid Application Design in Financial Services (Aerospike)
Applying internet NoSQL design patterns to fraud detection and risk scoring, including when to use SQL and when to use NoSQL. The state of NAND Flash and NVMe is also discussed, as well as storage class memory futures with Intel's 3D Xpoint technology.
This talk was presented in LA at the following meetup:
http://www.meetup.com/scalela/events/233396111/
Big Data Day LA 2016 / Big Data Track - Portable Stream and Batch Processing w... (Data Con LA)
This talk explores deploying a series of small and large batch and streaming pipelines locally, to Spark and Flink clusters and to Google Cloud Dataflow services to give the audience a feel for the portability of Beam, a new portable Big Data processing framework recently submitted by Google to the Apache foundation. This talk will look at how the programming model handles late arriving data in a stream with event time, windows, and triggers.
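As a taste of the model, here is a minimal pipeline in Beam's Python SDK showing event time and fixed windows (sample keys and timestamps are invented; triggers and late-data handling are left out for brevity):

```python
import apache_beam as beam
from apache_beam.transforms.window import FixedWindows, TimestampedValue

with beam.Pipeline() as p:  # DirectRunner locally; Spark/Flink/Dataflow via pipeline options
    (p
     | beam.Create([("user_a", 10), ("user_a", 65), ("user_b", 70)])  # (key, event time)
     | beam.Map(lambda kv: TimestampedValue((kv[0], 1), kv[1]))       # attach event time
     | beam.WindowInto(FixedWindows(60))                              # 60 s event-time windows
     | beam.CombinePerKey(sum)                                        # count per key per window
     | beam.Map(print))
```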
Cloud Dataflow - A Unified Model for Batch and Streaming Data Processing (DoiT International)
Dataflow is a unified programming model and a managed service for developing and executing a wide range of data processing patterns including ETL, batch computation, and continuous computation. Cloud Dataflow frees you from operational tasks like resource management and performance optimization.
A presentation about the deployment of an ELK stack at bol.com
At bol.com we use Elasticsearch, Logstash and Kibana in a log-search system that allows our developers and operations people to easily access and search through log events coming from all layers of its infrastructure.
The presentation explains the initial design and its failures. It continues with the latest design (mid-2014) and its improvements. Finally, a set of tips is given regarding Logstash and Elasticsearch scaling.
These slides were first presented at the Elasticsearch NL meetup on September 22nd 2014 at the Utrecht bol.com HQ.
Strata Singapore: Gearpump, Real-time DAG Processing with Akka at Scale (Sean Zhong)
Gearpump is an Akka-based real-time streaming engine that uses Actors to model everything. It offers great performance and flexibility: a throughput of 18,000,000 messages/second with a latency of 8 ms on a cluster of 4 machines.
Using Riak for Events storage and analysis at Booking.com (Damien Krotkine)
At Booking.com, we have a constant flow of events coming from various applications and internal subsystems. This critical data needs to be stored for real-time, medium- and long-term analysis. Events are schema-less, making it difficult to use standard analysis tools. This presentation will explain how we built a storage and analysis solution based on Riak. The talk will cover: data aggregation and serialization, Riak configuration, solutions for lowering the network usage, and finally, how Riak's advanced features are used to perform real-time data crunching on the cluster nodes.
JCON Online 2021, International Java Community Conference, 07.10.21, Moritz Kammerer (@Moritz Kammerer, Expert Software Engineer at QAware).
=== Please download slides in case they are blurred! ===
In this talk we took a look at how microservices can be developed with Micronaut. In the slides you can find out whether it kept its promise.
Scala-like distributed collections - dumping time-series data with Apache Spark (Demi Ben-Ari)
Spark RDDs are almost identical to Scala collections, just distributed: all of the transformations and actions are derived from the Scala collections API.
As Martin Odersky mentioned, “Spark - The Ultimate Scala Collections” is the right way to look at RDDs. But with that great distributed power comes a great many data problems: at first you’ll start tackling the concept of partitioning, then the actual data becomes the next thing to worry about.
In the talk we’ll go through an overview on Spark's architecture, and see how similar RDDs are to the Scala collections API. We'll then shift to the world of problems that you’ll be facing when using Spark for processing a vast volume of time-series data with multiple data stores (S3, MongoDB, Apache Cassandra, MySQL).
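The talk itself uses Scala, but the "RDDs are distributed collections" point is visible from any SDK. In this PySpark sketch (data and names invented; 1.852 converts knots to km/h), the same filter/map pipeline runs on a plain list and on an RDD:

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "rdd-vs-collection")
readings = [("ship_1", 12.5), ("ship_2", None), ("ship_1", 13.1)]

local = [(k, v * 1.852) for k, v in readings if v is not None]  # plain Python collection
rdd = (sc.parallelize(readings)                 # same data, partitioned across workers
         .filter(lambda kv: kv[1] is not None)  # same transformations, lazily evaluated
         .map(lambda kv: (kv[0], kv[1] * 1.852)))
assert rdd.collect() == local
sc.stop()
```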
When you start tackling many scale and performance problems, many questions arise:
> How to handle missing data?
> Should the system handle both serving and backend processes, or should we separate them out?
> Which solution is cheaper?
> How do we get the best performance for money spent?
In the talk we will tell the tale of all of the transformations we’ve made to our data and review the multiple data persistency layers... and I’ll try my best NOT to answer the question “which persistency layer is the best?” but I do promise to share our pains and lessons learned!
Apache Flink™ - A Next-Generation Stream Processor (Aljoscha Krettek)
In this talk we first give a short overview of the current state of streaming data analysis. We then continue with a brief introduction to the Apache Flink system for real-time data analysis, before diving deeper into some of the interesting properties that set Flink apart from the other players in this space. To do so, we look at exemplary use cases that either come directly from users or are based on our experience with users. Specific features we will look at include support for splitting events into individual sessions based on the time an event happened (event time), determining points at which to save the state of a streaming program for later restarts, efficient handling of very large stateful streaming computations, and access to that state from outside.
How does the Cloud Foundry Diego Project Run at Scale? (VMware Tanzu)
From Pivotal's Amit Gupta on July 9, 2015: a look at how the Cloud Foundry Diego project runs at scale and what it took to get there, covering the Diego scheduler, the performance-testing efforts, and all the tools necessary to ensure that Cloud Foundry can scale quickly and effortlessly.
To learn more, visit pivotal.io/platform-as-a-service/pivotal-cloud-foundry
How does the Cloud Foundry Diego Project Run at Scale, and Updates on .NET Su... (Amit Gupta)
The Cloud Foundry Diego team at Pivotal has been hard at work for the past few months exploring and improving Diego's performance at scale and under stress. This talk covers the goals, tools, and results of the experiments to date, as well as a glimpse of what's next.
And finally, a brief teaser about the current state of .NET support in Diego.
S3, Cassandra or outer space? Dumping time-series data using Spark (Demi Ben-Ari)
A vast volume of our processed data is time-series data, and once you start working with distributed systems you start tackling many scale and performance problems. Many questions arise:
How to handle missing data?
Should my system handle both serving and backend processes, or should we separate them out?
Which one of the solutions will be cheaper? Which gives the best performance for the money?
In the talk we will tell the tale of all of the transformations we’ve made to our data model @Windward, show some of the problems we’ve handled, and review the multiple data persistency layers: S3, MongoDB, Apache Cassandra, MySQL.
And I’ll try my best NOT to answer the question “Which one of them is the Best?”
Sharing our Pain and Lessons learned is promised!
Bio:
Demi Ben-Ari, Sr. Data Engineer @Windward,
I have over 9 years of experience building various systems, both near-real-time applications and Big Data distributed systems.
Co-Founder of the “Big Things” Big Data community: http://somebigthings.com/big-things-intro/
I’m a software development groupie, interested in tackling cutting-edge technologies.
Java Day 2021, WeAreDevelopers, 2021-09-01, online: Moritz Kammerer (@Moritz Kammerer, Expert Software Engineer at QAware).
=== Please download slides in case they are blurred! ===
In this talk, we took a look at how Microservices can be developed with Micronaut. Have a look if it has kept its promises.
Get Lower Latency and Higher Throughput for Java Applications (ScyllaDB)
Getting the best performance out of your Java applications can often be a challenge due to the managed environment nature of the Java Virtual Machine and the non-deterministic behaviour that this introduces. Automatic garbage collection (GC) can seriously affect the ability to hit SLAs for the 99th percentile and above.
This session will start by looking at what we mean by speed and how the JVM, whilst extremely powerful, means we don’t always get the performance characteristics we want. We’ll then move on to discuss some critical features and tools that address these issues, such as garbage collectors and JIT compilers. At the end of the session, attendees will have a clear understanding of the challenges and solutions for low-latency Java.
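The session targets the JVM, but the headline effect (collector pauses surfacing at the 99th percentile) exists in any garbage-collected runtime. A sketch with Python's cycle collector standing in for the JVM's GC:

```python
import gc
import random
import time

def percentile(samples, pct):
    return sorted(samples)[int(len(samples) * pct / 100)]

latencies, garbage = [], []
for _ in range(200_000):
    t0 = time.perf_counter_ns()
    node = [random.random()]
    node.append(node)            # a cycle: only the garbage collector can reclaim it
    garbage.append(node)
    if len(garbage) > 1_000:
        garbage.clear()
    latencies.append(time.perf_counter_ns() - t0)

print("p50:", percentile(latencies, 50), "ns")
print("p99:", percentile(latencies, 99), "ns")  # inflated by collection pauses
print("gen0 collections:", gc.get_stats()[0]["collections"])
```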
Dataflow - A Unified Model for Batch and Streaming Data Processing (DoiT International)
From the "Batch and Streaming Data Processing and Visualize 300TB in 5 Seconds" meetup on April 18th, 2016 (http://www.meetup.com/Big-things-are-happening-here/events/229532500)
YOW2018 Cloud Performance Root Cause Analysis at Netflix (Brendan Gregg)
Keynote by Brendan Gregg for YOW! 2018. Video: https://www.youtube.com/watch?v=03EC8uA30Pw . Description: "At Netflix, improving the performance of our cloud means happier customers and lower costs, and involves root cause analysis of applications, runtimes, operating systems, and hypervisors, in an environment of 150k cloud instances that undergo numerous production changes each week. Apart from the developers who regularly optimize their own code, we also have a dedicated performance team to help with any issue across the cloud, and to build tooling to aid in this analysis. In this session we will summarize the Netflix environment, procedures, and tools we use and build to do root cause analysis on cloud performance issues. The analysis performed may be cloud-wide, using self-service GUIs such as our open source Atlas tool, or focused on individual instances, and use our open source Vector tool, flame graphs, Java debuggers, and tooling that uses Linux perf, ftrace, and bcc/eBPF. You can use these open source tools in the same way to find performance wins in your own environment."
Stephan Ewen - Experiences running Flink at Very Large Scale (Ververica)
This talk shares experiences from deploying and tuning Flink stream processing applications at very large scale. We share lessons learned from users, contributors, and our own experiments about running demanding streaming jobs at scale. The talk will explain what aspects currently render a job as particularly demanding, show how to configure and tune a large-scale Flink job, and outline what the Flink community is working on to make the out-of-the-box experience as smooth as possible. We will, for example, dive into:
- analyzing and tuning checkpointing
- selecting and configuring state backends
- understanding common bottlenecks
- understanding and configuring network parameters
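Two of the knobs mentioned here, checkpointing and state backends, look roughly like this in the PyFlink API (the interval and parallelism values are illustrative, not recommendations; network parameters live in flink-conf.yaml):

```python
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.state_backend import EmbeddedRocksDBStateBackend

env = StreamExecutionEnvironment.get_execution_environment()
env.set_parallelism(64)                               # scale-out knob; illustrative value
env.enable_checkpointing(60_000)                      # checkpoint every 60 s (argument in ms)
env.set_state_backend(EmbeddedRocksDBStateBackend())  # keep large keyed state on disk
env.get_checkpoint_config().set_min_pause_between_checkpoints(30_000)  # let the job breathe
```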
This is an unplanned lightning talk at Kubernetes Community Days 2019. It's about how Booking.com does multi-cluster blue-green/canary deployments in its Kubernetes infrastructure.
The Other Side of Service-Oriented Architecture (Ivan Kruglov)
On the internet you can find many articles about moving to a service-oriented architecture (SOA) or its special case, the microservice architecture. They all describe in detail the advantages of such a move: breaking a large monolithic codebase into loosely coupled pieces, independent and fast deployment, and other aspects. Far fewer articles describe in detail the price of that move. In my talk I want to focus on that price, the other side of the coin: the fact that moving to SOA is a fundamental shift for a company in areas such as infrastructure, operational expertise, communication between services and between people, contracts, mentality, ownership, org structure, and more.
For the last two years, Booking.com has been working on speeding up the time-to-market of new products. Part of the new approach is building an internal cloud, founded on 15 Kubernetes clusters. In building them, Booking.com departed from common practice: the setup includes shared clusters, a flat network, cleverly computed SLOs/SLIs for each cluster, and a set of tests that verify the functionality of each cluster and its integrations in real time.
Booking.com does not merely operate Kubernetes; it also actively develops system applications meant to improve the ecosystem. For example, it created and open-sourced Shipper, a set of controllers providing Kubernetes-native orchestration of canary and blue-green deployments across several clusters at once.
The Thorns of Containerized Applications and Microservices (Ivan Kruglov)
Over the last two and a half years, Booking.com has gone through three generations of private clouds. The first was built on Mesos and Marathon; it was in active use for about half a year before we decided to drop it. The second was built on OpenShift; we worked on it for about a year and dropped it as well. Now we have a third generation on plain Kubernetes, which is what we live with today.
In my talk I want to walk through each of these stages and explain the reasons behind the changes. It is also interesting to see how adopting containerized applications and a service-oriented architecture forced us to rebuild internal processes, from granting database permissions to introducing a service mesh. We had to rework many elements of the infrastructure, and what started as a small project eventually grew into something much bigger.
Introducing envoy-based service mesh at Booking.com (Ivan Kruglov)
Service mesh is a dedicated layer of a company's infrastructure which should simplify communication between services and make it reliable, secure and observable.
In this talk, we'll go deep into Booking.com's case study of introducing a service mesh. We will cover the reasons for and objectives of the project, and why Envoy was selected as the base rather than the other available options. We'll look at the setup and features of the homegrown control plane, expand on what service is provided to developers and how they safely deploy potentially dangerous configuration changes, and finally talk about the pitfalls met along the way.
A service mesh is a dedicated layer in a company's infrastructure intended to simplify communication between services and make it reliable and secure. Depending on whom you ask, the jurisdiction of a service mesh includes request routing, service discovery, load balancing, error handling, monitoring, tracing, authorization and authentication, and more. Implementations also vary, from spreading the functionality across the whole stack to concentrating most of it in one place. In my talk I would like to address building a service mesh, using Booking.com as an example, in the context of the move from a monolithic to a service-oriented architecture. We will dive into the details of some components and look at examples of solutions, both successful and less so. We will also touch on our experience introducing and operating the L7 proxies Envoy and linkerd.
SOA: Sending a Request to a Server? What Could Be Simpler?! (Ivan Kruglov)
Microservices are cool, fashionable, and interesting, and moving to them brings a team noticeable benefits. But service-oriented architecture (SOA) is not without drawbacks. One of them is that by replacing a simple function call with an RPC, we implicitly introduce a whole constellation of new unknowns into the equation that governs the stability of the system. For example, over its lifetime even the simplest HTTP request passes through a multitude of buffers, queues, and algorithms on its way from client to server and back. The combined behaviour of these components is hard to predict, understand, and interpret correctly, especially in unusual situations.
In my talk I want to share my experience solving the problems I ran into while working at Booking.com. I will describe how small amounts of tuning on the server and the client significantly affected the overall stability of the system.
Booking.com is a popular online hotel booking service. Finding a hotel that matches given criteria is an integral part of the business model and the customer's main tool.
With the company growing constantly, a lot of attention is paid to the performance and scalability of search. As a result, over its lifetime the search architecture has undergone several global redesigns, from a simple MySQL database to a multi-component distributed service.
In its current incarnation, search at Booking.com consists of three subsystems:
1) an auto-complete and geo-disambiguation service;
2) a hotel search and availability service;
3) a price precomputation system.
The first two are high-performance applications written in Java. The search service keeps its indexes in in-memory storage and its data in the embedded database RocksDB. The price precomputation logic is written in Perl, with MySQL as the storage.
Come to my talk and I will tell you how search evolved along with the company's growth. We will look in detail at the current architecture and why we decided to build it this way, and, of course, at the problems we had to fight and how we fought them.
Bringing code to the data: from MySQL to RocksDB for high volume searches (Ivan Kruglov)
Searches are hard, fast searches are harder, and even more so with a growing dataset. At Booking.com we face these problems, especially the last one: we have doubled the number of properties in the last two years. Searching across normalized data in MySQL stopped working for us 3-4 years ago, and optimizing the dataset in MySQL for searches recently began showing its limits on large destinations like Paris, Italy, or the Mediterranean. Join the talk to learn how we’re solving search problems by moving data from MySQL to RocksDB and bringing code to the data.
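The enabler here is that RocksDB keeps keys sorted, so a search index can be laid out as composite keys and answered with a prefix scan instead of a relational query. A toy sketch of that access pattern (key layout invented; a sorted list stands in for RocksDB's ordered iterator):

```python
import bisect

# Composite keys: avail:<destination>:<date>:<hotel>, kept in sorted order
# just as an LSM-tree store like RocksDB would keep them.
index = sorted([
    b"avail:paris:2016-07-01:hotel_17",
    b"avail:paris:2016-07-01:hotel_42",
    b"avail:paris:2016-07-02:hotel_17",
    b"avail:rome:2016-07-01:hotel_03",
])

def prefix_scan(keys, prefix: bytes):
    """seek(prefix), then iterate while keys still match, like a RocksDB iterator."""
    i = bisect.bisect_left(keys, prefix)
    while i < len(keys) and keys[i].startswith(prefix):
        yield keys[i]
        i += 1

print(list(prefix_scan(index, b"avail:paris:2016-07-01:")))  # all Paris hotels for that date
```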
Sereal is a schema-less, cross-language binary serialization protocol aiming to replace Storable in Perl. It natively handles almost all Perl data structures including “special" ones such as aliases, weakrefs, regexps and cyclic references. Sereal supports compression and string de-duplication and employs other nice tricks allowing it to produce one of the most compact outputs across available serialization libraries. It is also very fast!
Sereal is widely adopted at Booking.com for serialization of objects for storage (MySQL, Redis, Memcached), as the default format for transferring data between different services, and for delivering logging data in our distributed environment.
While very efficient in both speed and space, Sereal::Encoder and Sereal::Decoder provide facilities that are not optimal in certain cases. For example, many manipulations on encoded documents can be done only through re-serialization. To improve performance characteristics for these cases, a set of tools was created:
- Sereal::Path - a query engine for Sereal allowing extraction of a single part of a document, similar to XPath and JSONPath.
- Sereal::Merger - a tool to combine several Sereal documents into a single one.
- Sereal::Splitter - a tool to split large documents into smaller ones.
In my talk I will give a quick introduction to Sereal and its internals. I’ll cover some design choices and important features. Finally, I’m going to explain the reasons for creating these tools and the situations where they work best.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams, covering all aspects and disciplines involved: from the hydrologist’s survey of the valley before construction, through fluid dynamics, structural engineering, generation, and mains frequency regulation, to the transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co-editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
CFD Simulation of By-pass Flow in a HRSG module (R&R Consult)
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue of reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
8. Event Graphite Processor.
• The dataset is huge and it’s growing
• Every second of events takes 10–15GB of RAM
• Monitors are split into groups to run faster
• Every group runs in a fork
• Forking triggers copy-on-write (COW)
• RAM is being saturated
• No RAM = the box is being kicked out
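The copy-on-write failure mode listed on this slide can be sketched in a few lines of Python (a toy model, not EGP code; Unix-only because of os.fork):

```python
import os

# A big heap in the parent stands in for a second's worth of events (10-15 GB in EGP).
events = [{"name": f"event_{i}", "count": 0} for i in range(1_000_000)]

for group in range(4):   # one fork per monitor group
    pid = os.fork()
    if pid == 0:
        # Child pages start out shared with the parent, but every write (and in
        # CPython even reads, via refcount updates) dirties a page, so COW
        # silently copies it; RAM usage multiplies with each forked group.
        for e in events:
            e["count"] += 1
        os._exit(0)

for _ in range(4):
    os.wait()
```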
18. First results: promising (old vs. new).
• CPU: no changes
• Processing time: 60 sec vs 30 sec
• # of boxes: 20 vs 8
• RAM: 10 GB vs 100 MB
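The summary at the top attributes these gains to event streaming plus parallelization. A sketch of that shape, where workers receive small chunks so resident memory stays flat instead of growing with the dataset (the file name, chunk size, and worker count are hypothetical):

```python
import json
from multiprocessing import Pool

def run_monitors(chunk):
    # Placeholder for the real monitor logic; returns one metric per chunk.
    return len(chunk)

def event_chunks(stream, size=10_000):
    """Lazily group a stream of JSON lines into small chunks."""
    chunk = []
    for line in stream:
        chunk.append(json.loads(line))
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

with open("events.jsonl") as stream, Pool(processes=8) as pool:
    # imap_unordered pulls chunks lazily, so only a few are in flight at once.
    total = sum(pool.imap_unordered(run_monitors, event_chunks(stream)))
```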
19. EGP: Time to act!
1. RAM is an issue
2. New user monitors
3. New systems
4. More events every day
20. EGP migration TODO
1. Implement a proof of concept
2. Freeze EGP development
3. Migrate all monitors
4. Full-scale test
5. Roll out the new system
6. Profit!
21. Migration. Hackathon. Results.
1. 8 people
2. All done in 1 day
3. 260 monitors
4. 317 files changed, 10336 insertions, 11288 deletions
5. Ready to run a full-scale test
28. The results. The new system (old vs. new).
1. Processing time: 80 sec vs 40 sec
2. RAM: 16 GB vs 500 MB
3. # of boxes: 80 vs 30
29. Lessons learned.
1. Engineering is the king, collaboration is the queen
2. The ideas that failed individually might work together
3. Challenge everything