In this exciting and informative talk, presented at PgConf Silicon Valley 2015, Konstantin cuts through the theory to deliver a clear set of practical solutions for scaling applications atop PostgreSQL, eventually supporting millions of active users and tens of thousands of concurrent users, with an application stack that responds to requests in 100 ms on average. He shares how his team solved one of the biggest challenges they faced: effectively storing and retrieving over 3B rows of "saves" (Wanelo's equivalent of Instagram's "like" or Pinterest's "pin"), all in PostgreSQL, with highly concurrent random access.
Over the last three years, the team at Wanelo optimized the hell out of their application and database stacks. Using PostgreSQL 9 as their primary data store and the Joyent Public Cloud as a hosting environment, the team re-architected their backend for rapid expansion several times over as the unrelenting traffic kept climbing. This ultimately resulted in a highly efficient, horizontally scalable, fault-tolerant application infrastructure. Unimpressed? Now try getting there without Ops or DBA teams, all while deploying to production seven times per day, with an application maintaining 99.999% uptime over the last 6 months.
Real-time Analytics with Trino and Apache Pinot - Xiang Fu
Trino summit 2021:
An overview of the Trino Pinot Connector, which combines the flexibility of Trino's full SQL support with the power of Apache Pinot's real-time analytics, giving you the best of both worlds.
Interactive real time dashboards on data streams using Kafka, Druid, and Supe... - DataWorks Summit
When interacting with analytics dashboards, the two key requirements for a smooth user experience are quick response time and data freshness. To meet the requirements of building fast interactive BI dashboards over streaming data, organizations often struggle to select a proper serving layer.
Cluster computing frameworks such as Hadoop or Spark work well for storing large volumes of data, although they are not optimized for making it available for queries in real time. Long query latencies also make these systems suboptimal choices for powering interactive dashboards and BI use cases.
This talk presents an open source real-time data analytics stack using Apache Kafka, Druid, and Superset. The stack combines the low-latency streaming and processing capabilities of Kafka with Druid, which enables immediate exploration and provides low-latency queries over the ingested data streams. Superset provides the visualization and dashboarding layer and integrates nicely with Druid. In this talk we will discuss why this architecture is well suited to interactive applications over streaming data, present an end-to-end demo of the complete stack, walk through its key features, and discuss performance characteristics from real-world use cases.
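The wiring between these components is declarative: Druid is pointed at a Kafka topic via an ingestion supervisor spec. As a rough, abridged illustration (field names follow recent Druid versions; the data source, topic, columns, and broker address are all hypothetical):

```json
{
  "type": "kafka",
  "spec": {
    "dataSchema": {
      "dataSource": "page_events",
      "timestampSpec": { "column": "ts", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["page", "country"] }
    },
    "ioConfig": {
      "topic": "page_events",
      "inputFormat": { "type": "json" },
      "consumerProperties": { "bootstrap.servers": "kafka:9092" },
      "useEarliestOffset": true
    },
    "tuningConfig": { "type": "kafka" }
  }
}
```

Once the supervisor is submitted, Druid tasks consume the topic continuously and rows become queryable within seconds, which is what makes the dashboards feel fresh.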
Speaker
Nishant Bangarwa, Software Engineer, Hortonworks
Over the past few years, Git has established itself as the reference source control system. Simple, fast, and flexible, it suits both traditional and distributed workflows. After an introduction to Git's founding principles, trainees will learn to create their first Git repository and work with it locally before sharing it with others. The training emphasizes branch management, conflict resolution, and the commands that let you approach these problems with confidence. By the end of the course, trainees will be able to use Git autonomously, both individually and within a team.
InfluxDB Roadmap: What’s New and What’s Coming - InfluxData
InfluxDB is the time series platform made for developers to build time-series-based applications quickly at scale. Discover why developers use InfluxDB to build real-time applications for analytics, IoT and cloud-native services in less time with less code. InfluxDB Cloud is a fast, elastic, serverless real-time monitoring platform, dashboarding engine, analytics service and event and metrics processor - it is available on AWS, Azure and Google Cloud.
Join Balaji Palani, Senior Director of Product Management, in this webinar as he demonstrates the latest features of InfluxDB Cloud. This one-hour webinar features a product update and Q&A time.
Uber has one of the largest Kafka deployments in the industry. To improve scalability and availability, we developed and deployed a novel federated Kafka cluster setup that hides cluster details from producers and consumers. Users do not need to know which cluster a topic resides in; clients see only a "logical cluster". The federation layer maps clients to the actual physical clusters and keeps the location of the physical cluster transparent to the user. Cluster federation brings us several benefits that support our business growth and ease our daily operations. In particular:

Client control: Inside Uber there are a large number of applications and clients on Kafka, and it's challenging to migrate a topic with live consumers between clusters. Coordination with the users is usually needed to shift their traffic to the migrated cluster. Cluster federation gives the server side much more control over clients by enabling consumer traffic redirection to another physical cluster without restarting the application.

Scalability: With federation, the Kafka service can scale horizontally by adding more clusters when a cluster is full. Topics can freely migrate to a new cluster without notifying the users or restarting the clients. Moreover, no matter how many physical clusters we manage per topic type, from the user's perspective there is only one logical cluster.

Availability: With a topic replicated to at least two clusters, we can tolerate a single-cluster failure by redirecting the clients to the secondary cluster without performing a region failover. This also gives us much more freedom and reduces the risk of carrying out important maintenance on a critical cluster: before the maintenance, we mark the cluster as secondary and migrate off the live traffic and consumers.

We will present the details of the architecture and several interesting technical challenges we overcame.
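The logical-to-physical mapping described above can be sketched as a tiny routing layer. This is purely illustrative with hypothetical names (`FederationRouter`, `resolve`, the cluster IDs), not Uber's actual implementation:

```python
# Illustrative sketch of a Kafka federation routing layer; not Uber's
# actual implementation. All class, method, and cluster names are invented.

class FederationRouter:
    """Maps logical topics to physical clusters, with a secondary for failover."""

    def __init__(self):
        # topic -> (primary cluster, secondary cluster)
        self.placement = {}

    def assign(self, topic, primary, secondary=None):
        self.placement[topic] = (primary, secondary)

    def resolve(self, topic, failed_clusters=frozenset()):
        """Return the physical cluster a client should use for this topic.

        Clients only ever see the logical cluster; redirection to the
        secondary happens here, without restarting the application.
        """
        primary, secondary = self.placement[topic]
        if primary not in failed_clusters:
            return primary
        if secondary is not None and secondary not in failed_clusters:
            return secondary
        raise RuntimeError(f"no healthy cluster for topic {topic!r}")


router = FederationRouter()
router.assign("rider-events", primary="kafka-a", secondary="kafka-b")

print(router.resolve("rider-events"))                               # kafka-a
print(router.resolve("rider-events", failed_clusters={"kafka-a"}))  # kafka-b
```

The point of the sketch is that failover is a routing decision on the server side: the client keeps asking for the same logical topic, and only the resolved physical cluster changes.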
Hyperspace is a recently open-sourced (https://github.com/microsoft/hyperspace) indexing sub-system from Microsoft. The key idea behind Hyperspace is simple: Users specify the indexes they want to build. Hyperspace builds these indexes using Apache Spark, and maintains metadata in its write-ahead log that is stored in the data lake. At runtime, Hyperspace automatically selects the best index to use for a given query without requiring users to rewrite their queries. Since Hyperspace was introduced, one of the most popular asks from the Spark community was indexing support for Delta Lake. In this talk, we present our experiences in designing and implementing Hyperspace support for Delta Lake and how it can be used for accelerating queries over Delta tables. We will cover the necessary foundations behind Delta Lake’s transaction log design and how Hyperspace enables indexing support that seamlessly works with the former’s time travel queries.
Elastic Load Balancing Deep Dive and Best Practices - Pop-up Loft Tel Aviv - Amazon Web Services
Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances for fault tolerance and load distribution. In this session, we go into detail about Elastic Load Balancing's configuration and day-to-day management, as well as its use in conjunction with Auto Scaling. We explain how to make decisions about the service's many customization choices. We also share best practices and useful tips for success.
Kafka Connect: Real-time Data Integration at Scale with Apache Kafka, Ewen Ch... - confluent
Many companies are adopting Apache Kafka to power their data pipelines, including LinkedIn, Netflix, and Airbnb. Kafka’s ability to handle high throughput real-time data makes it a perfect fit for solving the data integration problem, acting as the common buffer for all your data and bridging the gap between streaming and batch systems.
However, building a data pipeline around Kafka today can be challenging because it requires combining a wide variety of tools to collect data from disparate data systems. One tool streams updates from your database to Kafka, another imports logs, and yet another exports to HDFS. As a result, building a data pipeline can take significant engineering effort and has high operational overhead, because all these different tools require ongoing monitoring and maintenance. Additionally, some of the tools are simply a poor fit for the job: the fragmented nature of the data integration tools ecosystem leads to creative but misguided solutions such as misusing stream processing frameworks for data integration purposes.
We describe the design and implementation of Kafka Connect, Kafka’s new tool for scalable, fault-tolerant data import and export. First we’ll discuss some existing tools in the space and why they fall short when applied to data integration at large scale. Next, we will explore Kafka Connect’s design and how it compares to systems with similar goals, discussing key design decisions that trade off between ease of use for connector developers, operational complexity, and reuse of existing connectors. Finally, we’ll discuss how standardizing on Kafka Connect can ultimately simplify your entire data pipeline, making ETL into your data warehouse, and enabling stream processing applications, as simple as adding another Kafka connector.
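Connectors in Kafka Connect are configured declaratively rather than coded by hand. As an illustration, a source connector that streams a log file into a topic could be registered with a JSON config like the following (the file path, topic, and connector name are placeholders; `FileStreamSourceConnector` ships with Apache Kafka):

```json
{
  "name": "local-file-source",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/var/log/app.log",
    "topic": "app-logs"
  }
}
```

Posting this document to a Connect worker's REST endpoint is all it takes to start the import; no bespoke ingestion code is written or deployed.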
This is an introduction to Spark, written to walk through everything from big data concepts to the emergence of big data analytics platforms (Hadoop) and the background that led to Spark.
The concept of RDDs and the Spark SQL library are explained in some detail (including brief explanations of the Tungsten engine and the Catalyst optimizer).
The deck ends with a short installation guide and hands-on material for interactive analysis.
The original PPT is publicly available, so feel free to adapt it whenever and however you need; just keep the attribution intact.
Figures and materials taken from other slides or blogs are credited in small print, but the sources of some materials found while writing early versions of this deck remain unclear. If you know a source, let me know and I will update the credits. (Tips welcome!)
This material was presented at Little Big Data #1, a meetup where people from all kinds of backgrounds share their data science stories.
Feel free to ask questions anytime :)
An event recap is available at https://zzsza.github.io/etc/2018/04/21/little-big-data/ !
(Added May 2018) I'm currently between jobs, so anyone interested in working with me is welcome to get in touch :)
Watch this talk here: https://www.confluent.io/online-talks/apache-kafka-architecture-and-fundamentals-explained-on-demand
This session explains Apache Kafka’s internal design and architecture. Companies like LinkedIn are now sending more than 1 trillion messages per day to Apache Kafka. Learn about the underlying design in Kafka that leads to such high throughput.
This talk provides a comprehensive overview of Kafka architecture and internal functions, including:
-Topics, partitions and segments
-The commit log and streams
-Brokers and broker replication
-Producer basics
-Consumers, consumer groups and offsets
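The bullet points above can be made concrete with a toy model (invented names, not Kafka's real API): partitions are append-only commit logs, and each consumer group keeps its own offset per partition.

```python
# Toy model of Kafka's storage concepts: a topic is a set of partitions,
# each partition is an append-only commit log, and each consumer group
# tracks its own read position (offset) per partition.
# Purely illustrative; these are not Kafka's actual classes.

class Partition:
    def __init__(self):
        self.log = []                    # the append-only commit log

    def append(self, record):
        self.log.append(record)
        return len(self.log) - 1         # offset of the new record

class Topic:
    def __init__(self, partitions=3):
        self.partitions = [Partition() for _ in range(partitions)]

    def produce(self, key, value):
        # Records with the same key land in the same partition,
        # which is what gives Kafka per-key ordering.
        p = hash(key) % len(self.partitions)
        return p, self.partitions[p].append((key, value))

offsets = {}  # (group, partition) -> next offset this group will read

def consume(group, topic, partition):
    pos = offsets.get((group, partition), 0)
    records = topic.partitions[partition].log[pos:]
    offsets[(group, partition)] = len(topic.partitions[partition].log)  # commit
    return records

topic = Topic(partitions=2)
p, first_offset = topic.produce("user-1", "clicked")
print(first_offset)  # 0: offsets start at zero and only ever grow
```

Because offsets live outside the log, many groups can read the same partition independently, which is the property the talk's "consumer groups and offsets" section builds on.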
This session is part 2 of 4 in our Fundamentals for Apache Kafka series.
Hybrid Apache Spark Architecture with YARN and Kubernetes - Databricks
Lyft is on a mission to improve people’s lives with the world’s best transportation. Starting in 2019, Lyft has been running both batch ETL and ML Spark workloads primarily on Kubernetes with the Apache Spark on k8s operator. However, with the increasing scale of workloads in frequency and resource requirements, we started hitting numerous reliability issues related to IP allocation, container images, IAM role assignment, and the Kubernetes control plane.
To continue supporting growing Spark usage at Lyft, the team came up with a hybrid architecture based on Kubernetes and YARN, optimized for both containerized and non-containerized workloads. In this talk, we will also cover a dynamic runtime controller that helps with per-environment config overrides and easy switchover between resource managers.
What Is the Core of Spark? RDD! (RDD paper review) - Yongho Ha
Spark is getting even hotter than Hadoop these days.
To understand the core of Spark, you need to understand its central data structure, the Resilient Distributed Dataset (RDD).
Let's review the original paper to see how RDDs actually work.
http://www.cs.berkeley.edu/~matei/papers/2012/sigmod_shark_demo.pdf
CQRS/ES stands for Command Query Responsibility Segregation and Event Sourcing.
● When combined, they allow us to build more resilient and elastic, hence reactive, applications.
● They are used for building auditable, scalable, or extremely resilient applications.
● They give our application a competitive advantage.
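A minimal sketch of these ideas, with all names invented: commands append immutable events, and queries rebuild state by replaying them.

```python
# Minimal CQRS + event-sourcing sketch (illustrative; all names invented).
# The write side appends immutable events; the read side is a projection
# computed by replaying those events.

events = []  # append-only event store

def command_deposit(account, amount):
    # Command side: validate, then record *what happened* as an event.
    if amount <= 0:
        raise ValueError("deposit must be positive")
    events.append({"type": "Deposited", "account": account, "amount": amount})

def command_withdraw(account, amount):
    if amount > balance(account):
        raise ValueError("insufficient funds")
    events.append({"type": "Withdrew", "account": account, "amount": amount})

def balance(account):
    # Query side: state is rebuilt by replaying events. Auditing and
    # "time travel" debugging fall out of this: replay a prefix of the
    # log to see the state at any past moment.
    total = 0
    for e in events:
        if e["account"] != account:
            continue
        total += e["amount"] if e["type"] == "Deposited" else -e["amount"]
    return total

command_deposit("acct-1", 100)
command_withdraw("acct-1", 30)
print(balance("acct-1"))  # 70
```

In a production system the projection would be cached or materialized rather than replayed on every query, but the separation of write path and read path is the essence of the pattern.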
A collection of problems and solutions worked through by the Wanelo team as they scaled the site to meet rapidly growing user demand. By Konstantin Gredeskoul and Eric Saxby.
Do you need Ops in your new startup? If not now, then when? And...what is Ops?
Learn how to scale ruby-based distributed software infrastructure in the cloud to serve 4,000 requests per second, handle 400 updates per second, and achieve 99.97% uptime – all while building the product at the speed of light.
Unimpressed? Now try doing all of the above without an Ops team, while growing your traffic 100x in 6 months and deploying 5-6 times a day!
It could be a dream, but luckily it's a reality that could be yours.
How open source is driving DevOps innovation: CloudOpen NA 2015 - Gordon Haff
It’s no coincidence that all the interest around DevOps today comes at a time when open source technologies and processes are so dominant in cloud computing, data storage and analysis, and--increasingly--in networking. Innovations in Linux and other projects, including containers, configuration management, and continuous integration, are what make DevOps workflows and portable application deployments possible. But it’s also the result of open source culture, practices, and the tools supporting those practices that have made iterative development and collaboration such a powerful model for creating great software in communities. And now, they’re also providing a template for how to develop and operate applications internally within enterprises. In this session, we will discuss how open source tools and practices can be applied to create effective DevOps workflows and practices.
Managing PostgreSQL Backups with pgbarman - Juliano Atanazio
A tutorial covering everything from installation to practical applications of pgbarman.
This presentation was originally given as a talk at the national PostgreSQL event in Brazil (PgBr 2013).
What does it mean to be CTO of an IoT startup? What do you need to know? How does a CTO feel?
You can watch it on the With the Best platform, along with many other interesting videos:
https://iot-platform.withthebest.com/#/replay
or here https://youtu.be/ciq0YOCOVvQ?t=2m2s
A crash course in the essentials of how to think like a startup CTO; a set of tools you can use to drive to the 'why' of the essential tech decisions you'll need to make. It's not 'to microservice or not to microservice?', it's 'WHY microservices?', and explaining that answer in terms investors and business stakeholders will understand and agree with.
Safely Drinking from the Data Waterhose - Jan Schaumann
An ignite talk given at DataGotham 2012 about how we extract security-related events and alerts from our logs. I gave the same talk again at DevOpsDays NYC 2013.
How to use any static site generator with GitLab Pages - Ivan Nemytchenko
You can delegate all the routine work of static site generation to GitLab Pages, thanks to GitLab's built-in CI service.
March 19, 2016, Kiev, Ruby Meditation
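Delegating site generation to GitLab's built-in CI boils down to a `.gitlab-ci.yml` with a job named `pages` that writes the site into `public/`. A minimal sketch, assuming a Jekyll site (the image and build commands vary with your generator):

```yaml
# Minimal GitLab Pages pipeline. The job must be named "pages" and must
# publish the generated site as an artifact under "public/".
image: ruby:3.2

pages:
  script:
    - gem install jekyll
    - jekyll build --destination public
  artifacts:
    paths:
      - public
  only:
    - master
```

Any generator works the same way: swap the image and the `script` lines, keep the job name and the `public/` artifact.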
Managing Large Flask Applications On Google App Engine (GAE) - Emmanuel Olowosulu
There are a number of issues production applications need to solve to be scalable and fault tolerant. In this talk, we explore some tips for efficiently running Python apps, particularly with Flask, on App Engine. We also share some collective experience and best practices on GAE.
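As a minimal sketch of the kind of Flask app discussed here (illustrative; the route and names are invented, and on App Engine you would pair it with an `app.yaml` declaring the Python runtime):

```python
# Minimal Flask app of the sort deployed to Google App Engine.
# Illustrative only: the /healthz route is a made-up example, and the
# accompanying app.yaml (runtime declaration) is omitted.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/healthz")
def healthz():
    # Keep handlers fast and side-effect free: on GAE, slow handlers
    # skew autoscaling decisions and tie up instances.
    return jsonify(status="ok")

if __name__ == "__main__":
    # Local development only; App Engine serves the app via its own
    # frontend, so this block never runs in production.
    app.run(host="127.0.0.1", port=8080)
```

The `if __name__ == "__main__"` guard matters on GAE: production imports the `app` object directly, so startup work should live at module scope only if it is cheap.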
redpill Mobile Case Study (Salvation Army) - Peter Presnell
Case study that summarizes key findings by Red Pill Development as they built a mobile interface for Notes applications at the Salvation Army. Using asymmetric modernization, a mobile interface can be delivered for an entire portfolio of applications in a few days.
Andrea Baldon, Emanuele Di Saverio - GraphQL for Native Apps: the MyAXA case ... - Codemotion
Dubbed with prominent descriptions like "REST done right", GraphQL, released by Facebook in 2015, is a technology quickly gaining adoption in the web space. How about mobile? In this talk we will discuss our "from the trenches" experience adopting GraphQL while designing MyAXA, the app of one of the biggest insurance companies in Italy and worldwide. We will discuss the features of the protocol and query language, including the most popular implementations on the Android, iOS and NodeJS sides, and expand on the best practices to squeeze the most value from this innovative approach.
Geospatial Intelligence Middle East 2013 - Big Data - Steven Ramage
Some initial considerations and discussion points around geospatial big data. Location adds context and relevance. You need to consider a number of "V" factors, including Value.
Watch full webinar here: https://bit.ly/3mdj9i7
You will often hear that "data is the new gold". In this context, data management is one of the areas that has received the most attention from the software community in recent years. From artificial intelligence and machine learning to new ways to store and process data, the landscape for data management is in constant evolution. From the privileged perspective of an enterprise middleware platform, we at Denodo have the advantage of seeing many of these changes happen.
In this webinar, we will discuss the technology trends that will drive the enterprise data strategies in the years to come. Don't miss it if you want to keep yourself informed about how to convert your data to strategic assets in order to complete the data-driven transformation in your company.
Watch this on-demand webinar as we cover:
- The most interesting trends in data management
- How to build a data fabric architecture
- How to manage your data integration strategy in the new hybrid world
- Our predictions on how those trends will change the data management world
- How companies can monetize data through a data-as-a-service infrastructure
- The role of voice computing in future data analytics
How a Time Series Database Contributes to a Decentralized Cloud Object Storag... - InfluxData
In this presentation, you'll learn how InfluxDB is a component of Storj’s Tardigrade service and workflows. John Gleeson and Ben Sirb of Storj Labs will discuss Storj’s redefinition of a cloud object storage network, how InfluxData fits into Storj’s Open Source Partner Program, and how to collect and manage high-volume, real-time telemetry data from a distributed network.
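For a feel of the telemetry format involved: InfluxDB ingests "line protocol" text (`measurement,tags fields timestamp`). A minimal, illustrative formatter follows; the measurement, tag, and field names are invented, and real client libraries handle escaping, batching, and retries for you.

```python
# Sketch of formatting telemetry as InfluxDB line protocol. Illustrative
# only: real clients (e.g. the official influxdb-client package) should
# be used in practice, since they handle escaping, batching, and retries.

def to_line(measurement, tags, fields, ts_ns):
    tag_part = ",".join(f"{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(
        # Integers carry an "i" suffix in line protocol; floats do not.
        f"{k}={v}i" if isinstance(v, int) else f"{k}={v}"
        for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part} {ts_ns}"

line = to_line("node_bandwidth",
               {"node": "storj-17", "region": "eu"},
               {"egress_bytes": 1024},
               1609459200000000000)
print(line)
# node_bandwidth,node=storj-17,region=eu egress_bytes=1024i 1609459200000000000
```

Tags are indexed and fields are not, which is why per-node identifiers go in tags while raw measurements go in fields.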
History of NoSQL and Azure DocumentDB feature set - Soner Altin
A short history of database systems, from DBMS and RDBMS to NoSQL solutions, followed by an introduction to Azure DocumentDB's SQL query support and to integrating DocumentDB with a simple Java application using the Maven repository.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
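The abstract above mentions ensuring atomicity between database updates and event production. One widely used technique for this is the transactional outbox; the sketch below is purely illustrative (invented table and event names, SQLite standing in for the real database) and is not Wix's actual implementation.

```python
# Transactional outbox sketch: the state change and its domain event are
# written in ONE database transaction, and a separate relay later ships
# outbox rows to the message broker. Illustrative only; not Wix's code.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT)")

def update_order(order_id, status):
    with db:  # one transaction: state change + event row, or neither
        db.execute("INSERT OR REPLACE INTO orders VALUES (?, ?)",
                   (order_id, status))
        event = {"type": "OrderStatusChanged", "id": order_id, "status": status}
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps(event),))

def drain_outbox(publish):
    # The relay: read, publish, then delete. If publishing crashes midway,
    # rows remain and are retried, giving at-least-once delivery.
    rows = db.execute("SELECT id, payload FROM outbox").fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))
        db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
    db.commit()

update_order("o-1", "PAID")
published = []
drain_outbox(published.append)
print(published[0]["status"])  # PAID
```

Because the event row commits or rolls back together with the state change, consumers never see an event for an update that didn't happen, and vice versa.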
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback, and a lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll know how to organize and improve your code review process.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...Juraj Vysvader
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I didn't get rich from it but it did have 63K downloads (powered possible tens of thousands of websites).
Launch Your Streaming Platforms in MinutesRoshan Dwivedi
The claim of launching a streaming platform in minutes might be a bit of an exaggeration, but there are services that can significantly streamline the process. Here's a breakdown:
Pros of Speedy Streaming Platform Launch Services:
No coding required: These services often use drag-and-drop interfaces or pre-built templates, eliminating the need for programming knowledge.
Faster setup: Compared to building from scratch, these platforms can get you up and running much quicker.
All-in-one solutions: Many services offer features like content management systems (CMS), video players, and monetization tools, reducing the need for multiple integrations.
Things to Consider:
Limited customization: These platforms may offer less flexibility in design and functionality compared to custom-built solutions.
Scalability: As your audience grows, you might need to upgrade to a more robust platform or encounter limitations with the "quick launch" option.
Features: Carefully evaluate which features are included and if they meet your specific needs (e.g., live streaming, subscription options).
Examples of Services for Launching Streaming Platforms:
Muvi [muvi com]
Uscreen [usencreen tv]
Alternatives to Consider:
Existing Streaming platforms: Platforms like YouTube or Twitch might be suitable for basic streaming needs, though monetization options might be limited.
Custom Development: While more time-consuming, custom development offers the most control and flexibility for your platform.
Overall, launching a streaming platform in minutes might not be entirely realistic, but these services can significantly speed up the process compared to building from scratch. Carefully consider your needs and budget when choosing the best option for you.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. It’s here, custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxrickgrimesss22
Discover the essential features to incorporate in your Winzo clone app to boost business growth, enhance user engagement, and drive revenue. Learn how to create a compelling gaming experience that stands out in the competitive market.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
Developing Distributed High-performance Computing Capabilities of an Open Sci...
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
1. Konstantin Gredeskoul, CTO, Wanelo.com | Twitter: @kig | GitHub: @kigster

THIS IS AN UPDATED AND REFRESHED VERSION (V2.0) OF THE ORIGINAL TALK PRESENTED AT THE SF POSTGRESQL USER GROUP, TITLED "12-STEP PROGRAM FOR POSTGRESQL-BASED WEB APPLICATIONS PERFORMANCE".

FROM OBVIOUS TO INGENIOUS
INCREMENTALLY SCALING WEB APPLICATIONS ON POSTGRESQL
2. “From Obvious to Ingenious: Scaling Web Applications atop PostgreSQL”, by Konstantin Gredeskoul, CTO Wanelo.com. | Twitter: @kig | Github: @kigster Page
DATA STORE TYPES: OVERVIEW

• Relational Databases: PostgreSQL, MySQL, Oracle, SQL Server have been around for decades. They are flexible, performant, and widely supported. Relational DBMSes represent a massive 81% of all data stores surveyed by db-engines.com.
• Document Stores: MongoDB, CouchDB, Amazon DynamoDB, Couchbase
• Key-Value Stores: Redis, Memcached, Riak, DynamoDB
• Wide Column Stores: Cassandra, HBase, Accumulo, Hypertable
• Search Engines: Solr, Elasticsearch, Splunk, Sphinx
• Graph DBMS: Neo4j, Titan, OrientDB
• Time Series DBMS: RRDtool, InfluxDB, Graphite
• Also exist: RDF stores, Object-Oriented, XML DBs, Content Stores, Navigational DBMS

* Based on db-engines.com/en/ranking as of December 2015
DATA STORE TYPES: MARKET SHARE
DATA STORE TYPES: CHANGE OVER TIME
• Even though Graph DBMSes show the largest increase over time, they account for a mere 0.8% of the total market share.
WEB APPS OF DISTINCTLY DIFFERENT TYPES

• OLTP (Online Transaction Processing)
These are the typical web applications with a massive number of users performing various operations concurrently. Online stores, social networks, Google, etc. are all OLTP applications. They require a huge throughput of small transactions, each as fast as possible (otherwise users leave), and they achieve that speed by keeping most of the "live" data cached.
• Analytics
A small number of users (analysts) run very long reports across entire data sets that are typically much larger than what would fit into RAM.
• Backend Processing
Somewhere in between the two, backend applications typically process large amounts of data for import/export, transformations, synchronization, and updates. These apps have almost no users (just admins) and push their data store to the limit. But since they have no real users, the speed of individual operations only affects the application's overall throughput.
CASE IN POINT: APP ASSUMPTIONS

• In this presentation we'll assume that we are building a massively concurrent, multi-user web application (using Ruby, Ruby on Rails, and other tools).
• This type of load is typically called "OLTP", meaning online transaction processing.
• In plain English, this means our database will receive a high throughput of concurrent requests on behalf of each session of each user working with the application at any given time.
• Sessions may be initiated from the web by users or admins, or from the mobile app by mobile users.
• Users want their app to be very responsive, and they leave when it isn't. Therefore OLTP applications need to be both fast (performance) and support many users (scalability).
• We are NOT going to address the needs of Analytics or Backend Processing applications, which have only a few users, or none.
1. SCALABILITY AND PERFORMANCE
• What to choose for a data store on a new application?
• Relational data model
• Structured vs unstructured
• Scalability vs performance
• Understanding latency
• Foundations of web architecture
• First signs of scaling issues: too many database reads, too many database writes

2. SCALING UP: LOW HANGING FRUIT
1. Caching
2. Fixing slow SQL (an optimization example)
3. Setting up streaming replication, and doing read/write splitting
4. Upgrading hardware
5. Where not to use PostgreSQL
6. Do not store append-only event data in PostgreSQL
7. Tune the DB & filesystem
8. Buffer and serialize frequent updates to the same row

3. SCALING OUT: IN THE TRENCHES
9. Optimize schema for scale
10. Vertically shard busy tables
11. Move vertically sharded tables behind micro-services
12. Horizontally shard the data-store behind a micro-service

Conclusions and final thoughts. Thanks & contact info.
STARTING A NEW APPLICATION: WHAT TO USE?

• If your application starts with a small data-set, then a relational database will give you the most flexibility, while enabling high-productivity software like Rails.
• PostgreSQL is not only a safe choice, but a great choice for new applications, due to its unprecedented versatility, speed, and reliability.
• Where it falls short, compared to some of the more specialized storage software, is massive horizontal scalability.
STRUCTURED DATA VS UNSTRUCTURED

• The overwhelming majority of common web application data is structured: we know its properties (columns) pretty well in advance, such as user.firstname, etc.
• Structured data is very effectively represented by the relational model, developed in 1969 by E.F. (Ted) Codd.
• The relational model is mathematically complete, and in practice excellent for mapping almost any domain, with very few exceptions: in the areas of time series, directional graphs, and full-text search.
STRUCTURED DATA VS UNSTRUCTURED

• For the last few years there has been a lot of hype surrounding "document" databases, in particular MongoDB.
• MongoDB's marketing appears set on "killing" (or replacing) relational databases. Not only is this very unlikely to occur, it frames the discussion in the wrong way: one OR the other, when down the road it's likely to be both.
• PostgreSQL has been continually growing its non-structured capabilities: it now supports the JSON, HSTORE and XML data types natively and very well.
SCALABILITY: THE CAPABILITY OF A SYSTEM, NETWORK, OR PROCESS TO HANDLE A GROWING AMOUNT OF WORK, OR ITS POTENTIAL TO BE ENLARGED IN ORDER TO ACCOMMODATE THAT GROWTH.
PERFORMANCE (LATENCY): GENERALLY DESCRIBES THE TIME IT TAKES FOR VARIOUS OPERATIONS TO COMPLETE, I.E. USER INTERFACES TO LOAD, OR BACKGROUND JOBS TO COMPLETE. PERFORMANCE & SCALABILITY ARE RELATED.
CASE STUDY

Founded in 2010, Wanelo ("wah-nee-loh," from Want, Need, Love) is a mall on your phone. It helps you find and bookmark ("save") the quirkiest products in the online universe.

A regular mall has 150 stores, but Wanelo has 550,000 stores, which include all the big brands you know, as well as tiny independent boutiques.

In 2013, traffic to Wanelo went from 2,000 requests per minute to 200,000 in about six months, a period of true exponential hyper-growth.
EARLY ENGINEERING GOALS

• Move as fast as possible with product development. We call it "Aggro-Agile"™
• Scale the app as needed, but invest in small and cheap things today that will save us a lot more time tomorrow
• Stay ahead of the growth curve by closely monitoring the application
• Keep overall costs low (stay lean, keep the app fast)
• Spend $$ where it matters most: to save precious and expensive developer time
• As a result, we took advantage of a large number of open source tools and paid services, which allowed us to move fast
FOUNDATIONS OF MODERN WEB ARCHITECTURE
FOUNDATIONAL TECHNOLOGIES

• app server (we use Unicorn)
• scalable web server in front (we use nginx)
• database (we use PostgreSQL)
• hosting environment (we use Joyent Cloud)
• deployment tools (Capistrano)
• server configuration tools (we use Chef)
• programming language + framework (Ruby on Rails)
• many others, such as monitoring and alerting
LET'S REVIEW: A SUPER SIMPLE APP

[Diagram: incoming HTTP → nginx (serving /home/user/app/current/public) → N Unicorn/Passenger workers (Ruby VM) → PostgreSQL server (/var/pgsql/data)]

• no redundancy, no caching (yet)
• can only process N concurrent requests
• nginx will serve static assets and deal with slow clients
• web sessions probably in the DB or a cookie
AJAXIFY: DO THIS EARLY, IT'S HARD TO ADD LATER

• Personalization via AJAX, so controller actions can be cached entirely using caches_action
• The page is returned unpersonalized; an additional AJAX request loads the personalization
DON'T SHOOT YOURSELF IN THE FOOT! DO THIS.

• Install 2+ memcached servers for caching, and use the Dalli gem to connect to them for redundancy
• Switch to memcached-based web sessions. Use sessions sparingly; assume they are transient
• Redis is also an option for sessions, but it's not as easy to use two Redis instances for redundancy as it is to use memcached with Dalli
• Set up a CDN for asset_host and any user-generated content. We use fastly.com
[Diagram: the same stack, now doubled: two nginx + Unicorn/Passenger app servers in front of the PostgreSQL server, with memcached added and a CDN caching images and JS]
ADD CACHING: CDN AND MEMCACHED

[Diagram: browser → CDN (caching images and JS) → nginx → N Unicorn/Passenger workers (Ruby VM) → PostgreSQL, with memcached alongside the app servers]

• geo-distribute and cache your UGC and CSS/JS
• cache HTML and serialized objects in memcached
• can increase TTLs to alleviate load if traffic spikes
SIDE NOTE: REMOVE SINGLE POINTS OF FAILURE

• Multiple load balancers require DNS round robin and a short TTL (dnsmadeeasy.com)
• Multiple long-running tasks (such as posting to Facebook or Twitter) require a background job processing framework
• Multiple app servers require haproxy between nginx and Unicorn
[Diagram: incoming HTTP → DNS round robin (or a failover/HA solution) → load balancers (nginx) → haproxy → app servers (N Unicorn/Passenger Ruby VMs) → a single PostgreSQL database; memcached and Redis as transient-to-permanent data stores; Sidekiq/Resque background workers; an object store for user-generated content; a CDN caching images and JS]

• This architecture can horizontally scale out only as far as the database at its center allows
• Every other component can be scaled by adding more of it, to handle more traffic
TRAFFIC CLIMB IS RELENTLESS

And it keeps climbing, sending our servers into a tailspin…
FIRST SIGNS OF READ SCALABILITY PROBLEMS

• Pages load slowly or time out
• Users are getting 503 Service Unavailable
• The database is slammed (very high CPU or read IO)
• Some pages load (cached?), some don't
FIRST SIGNS OF WRITE SCALABILITY PROBLEMS

• Database write IO is maxed out, CPU is not
• Updates are waiting on each other, piling up
• The application "locks up", timeouts
• Replicas are not catching up*

* More about replicas, and how to observe them "catching up", further down.
BOTH SITUATIONS MAY EASILY RESULT IN DOWNTIME
OUR USERS NOTICED IN SECONDS…

Even though we achieved 99.99% uptime in 2013, in 2014 we had a couple of short downtimes caused by a PostgreSQL replica overloaded by too many read requests.
SCALING UP
1. THE MOST IMPORTANT THINGS FIRST: CACHING
CACHING 101

• Anything that can be cached, should be
• A cache hit = many database hits avoided
• Even a hit rate of 17% still saves DB hits
• We can cache many types of things…
• Cache is cheap and fast (memcached)
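The first bullet is the cache-aside (read-through) pattern in a nutshell. Here is a minimal sketch in plain Ruby, with a hypothetical FakeCache class standing in for memcached; names and keys are illustrative, not Wanelo's actual code:

```ruby
# A tiny in-memory stand-in for memcached, illustrating cache-aside:
# check the cache first, and only hit the database on a miss.
class FakeCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  def get(key)
    entry = @store[key]
    return nil if entry.nil? || Time.now > entry.expires_at
    entry.value
  end

  def set(key, value, ttl)
    @store[key] = Entry.new(value, Time.now + ttl)
  end

  # Cache-aside: return the cached value, or compute, cache, and return it.
  def fetch(key, ttl)
    cached = get(key)
    return cached unless cached.nil?
    value = yield               # the expensive DB hit happens only on a miss
    set(key, value, ttl)
    value
  end
end

cache = FakeCache.new
db_hits = 0
2.times do
  cache.fetch("user/42", 300) do
    db_hits += 1                # simulate the database query
    { id: 42, name: "Ada" }
  end
end
puts db_hits                    # only the first fetch reaches the "database"
```

Every hit after the first avoids the database entirely until the TTL lapses, which is exactly why even a modest hit rate saves DB work.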
CACHE MANY TYPES OF THINGS

• caches_action in controllers is very effective
• fragment caches of reusable widgets
• we use the Compositor gem for our JSON API: we cache serialized object fragments, grab them from memcached using multi_get, and merge them
• our gem CacheObject provides a very simple and clever caching layer within the Ruby on Rails framework

git clone https://github.com/wanelo/compositor
git clone https://github.com/wanelo/cache-object
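The multi_get-and-merge idea can be sketched in plain Ruby. Here a plain hash stands in for the memcached client, and the keys and the get_multi helper are illustrative rather than the actual Compositor API:

```ruby
require "json"

# Sketch of the Compositor-style approach: each object is cached as a
# serialized fragment under its own key; one multi-get fetches all the
# fragments, and the response is assembled by merging them.
cache = {
  "product/1/core"  => JSON.dump(id: 1, name: "Lamp"),
  "product/1/stats" => JSON.dump(saves: 12_000, comments: 45),
}

def get_multi(cache, keys)
  # Like Dalli's get_multi, return only the keys that were found.
  keys.each_with_object({}) do |k, found|
    found[k] = cache[k] if cache.key?(k)
  end
end

fragments = get_multi(cache, ["product/1/core", "product/1/stats"])
merged = fragments.values
                  .map { |json| JSON.parse(json) }
                  .reduce({}, :merge)

puts JSON.dump(merged)  # one response assembled from independently cached parts
```

One round-trip to memcached replaces several single gets, and each fragment can be expired independently of the others.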
BUT EXPIRING CACHE IS NOT ALWAYS EASY

• The easiest way to expire a cache is to wait for it to expire (by setting a TTL ahead of time). But that's not always possible (i.e. sometimes an action requires wiping the cache, and it's not acceptable to wait)
• CacheSweepers in Rails help
• Caches can and should be expired in background jobs, as expiry might take time
• You can cache pages, fragments and JSON using a CDN!
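Expiring caches out of band can be sketched like this; the array queue stands in for Sidekiq/Resque, and the class and key names are made up for illustration:

```ruby
# Sketch: instead of deleting many cache keys inline in the request,
# enqueue a job that does the sweeping out of band.
class CacheSweepJob
  def initialize(cache, keys)
    @cache = cache
    @keys = keys
  end

  def perform
    @keys.each { |k| @cache.delete(k) }   # the slow part runs in a worker
  end
end

queue = []
cache = { "user/1/feed" => "...", "user/1/profile" => "...", "other" => "..." }

# In the request path: enqueue and return immediately.
queue << CacheSweepJob.new(cache, ["user/1/feed", "user/1/profile"])

# Later, a worker drains the queue.
queue.shift.perform until queue.empty?

puts cache.keys.inspect  # only keys outside the swept set remain
```

The request stays fast because it only pays for the enqueue, not for the sweep itself.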
MOBILE API, CDN AND CACHING TRICK

• All API responses that point to other API responses must always use fully qualified URLs (i.e. next_page, etc.)
• Multi-page grids can start being fetched from api.example.com
• The second and subsequent pages can be served from api-cdn.example.com
• If the CDN is down, a small configuration change and the mobile apps send all traffic back to the source (api.example.com)
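The host-switching trick boils down to one decision when building the fully qualified next_page URL. A sketch, using the slide's example hostnames and a hypothetical helper:

```ruby
# Build a fully qualified next_page URL: the first page is served by the
# origin API host, subsequent pages by the CDN-fronted host. Flipping
# CDN_ENABLED to false routes everything back to the origin.
API_HOST = "api.example.com"
CDN_HOST = "api-cdn.example.com"
CDN_ENABLED = true

def next_page_url(path, next_page)
  # Page 1 must hit the origin; later pages are cacheable on the CDN.
  host = (CDN_ENABLED && next_page > 1) ? CDN_HOST : API_HOST
  "https://#{host}#{path}?page=#{next_page}"
end

puts next_page_url("/products/trending", 2)
# → https://api-cdn.example.com/products/trending?page=2
```

Because clients follow whatever URL the API hands back, the fallback is a server-side config change, with no mobile app release required.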
SCALING UP
2. FINDING AND OPTIMIZING SLOW SQL
SQL OPTIMIZATION: LOG SLOW QUERIES

• Find slow SQL (>100ms) and either remove it, cache the hell out of it, or fix/rewrite the query
• Enable the slow query log in postgresql.conf (as well as lock and temp-file logging). These are the types of things you need to know about.
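In postgresql.conf, the relevant logging settings might look like this (the threshold values are illustrative; 100ms matches the slide's cutoff):

```ini
# postgresql.conf — log anything suspicious (illustrative values)
log_min_duration_statement = 100   # log statements taking longer than 100ms
log_lock_waits = on                # log when a session waits on a lock
log_temp_files = 0                 # log every temp file created, with its size
```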
TRACKING THE MOST TIME-CONSUMING SQL

• The pg_stat_statements module provides a means for tracking execution statistics of all SQL statements executed by a server.
• The module must be loaded by adding pg_stat_statements to shared_preload_libraries in postgresql.conf, because it requires additional shared memory. This means that a server restart is needed to add or remove the module.
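Once the module is loaded (and the extension created with CREATE EXTENSION pg_stat_statements), the most time-consuming statements can be pulled with a query along these lines (on PostgreSQL 9.x, total_time is in milliseconds):

```sql
-- Top 10 queries by total time spent, across all executions
SELECT calls,
       total_time,                    -- cumulative ms spent in this statement
       total_time / calls AS avg_ms,  -- average per call
       query
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 10;
```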
FIXING A SLOW QUERY

• Run the explain plan to understand how the DB runs the query, using "explain analyze <query>"
• Are there adequate indexes for the query? Is the database using the appropriate index? Has the table been recently analyzed?
• Can a complex join be simplified into a subselect?
• Can this query use an index-only scan?
• Can a column being sorted on be added to the index?
• What can we learn from watching the data in the two views pg_stat_user_tables and pg_stat_user_indexes? We could discover that the application is doing many sequential scans, or has several unused indexes that take up space and slow down inserts, and much more.
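For example, sequential-scan-heavy tables and never-used indexes can be spotted with queries like these (standard columns of the statistics views):

```sql
-- Tables read mostly by sequential scans (candidates for new indexes)
SELECT relname, seq_scan, seq_tup_read, idx_scan
  FROM pg_stat_user_tables
 ORDER BY seq_tup_read DESC
 LIMIT 10;

-- Indexes that have never been used (candidates for removal)
SELECT relname, indexrelname
  FROM pg_stat_user_indexes
 WHERE idx_scan = 0;
```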
SQL OPTIMIZATION, CTD.

Instrumentation software such as NewRelic shows slow queries, with explain plans, and time-consuming transactions.
SCALING UP
FIXING A QUERY: AN EXAMPLE
ONE DAY, I NOTICED LOTS OF TEMP FILES
(created in the postgres.log)
LET'S RUN THIS QUERY…

This join takes a whole second to return :(
FOLLOWS TABLE… STORIES TABLE…

So our index is partial, covering only state = 'active'. Regardless of whether this was intentional, the join results in a full table scan (called a "sequential scan"). But the state column isn't used in the query at all! Perhaps it's a bug?

A sequential scan on a large table, in a database used by an OLTP application, is bad, because it "steals" the database cache from many other queries: the OS will now load all of those pages into memory.
FIXING IT: LET'S ADD STATE = 'ACTIVE'

It was meant to be there anyway :) Now the query takes 3ms instead of 1000ms, and the IO on the server drops significantly, according to the NewRelic graph.
SCALING UP
3. SETTING UP STREAMING REPLICATION
SCALE READS BY REPLICATION

• In version 9.3 and later, setting up replication is very easy: a few settings in postgresql.conf on both the master & the replica
• So is electing a new master, and switching replicas to a new timeline
• Each PG release seems to make replication even easier

These settings have been tuned for SmartOS and our application requirements (thanks, PGExperts!)
REPLICATION 101: WHERE ARE MY REPLICAS?
• Once you have at least one streaming replica live, you must know at all times whether the replica is falling behind the master.
• Our nagios checks automatically show the difference in MB as well as the time lag. Grab our nagios replication check here:
https://github.com/wanelo/nagios-checks
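To illustrate the idea, here is a pure-Ruby sketch of the thresholding logic inside such a lag check. The method and threshold names are illustrative, not the actual code from the nagios-checks repo; on a replica, the lag in seconds can be obtained with `SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))`.

```ruby
# Minimal replication-lag check logic (hypothetical names):
# classify the replica's lag into nagios-style states.
def lag_status(lag_seconds, warn: 60, crit: 300)
  return [:critical, "replica is #{lag_seconds}s behind"] if lag_seconds >= crit
  return [:warning,  "replica is #{lag_seconds}s behind"] if lag_seconds >= warn
  [:ok, "replica is #{lag_seconds}s behind"]
end
```

A real check would map `:ok`, `:warning`, `:critical` to nagios exit codes 0, 1 and 2.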
REPLICATION 102: USE SSDS EVERYWHERE
• One question with replicas: can they keep up with all the writes coming from the master?
• What if the master is on SSDs, and the replicas aren’t? We tried this setup to save $$.
• We instantly bumped into a problem: applying WAL logs to the replicas created very significant disk write load on the non-SSD drives.
• These replicas were barely able to apply writes from the master even without live traffic.
• With traffic, they would start falling behind, the delta ever increasing.
The red line is the site’s error rate. Note the correlation.
HOW TO DISTRIBUTE READS TO REPLICAS?
This is a diagram of data flow between the application and the database with its replicas, using pgBouncer to provide connection pooling from each app server.
[Diagram: the Application Server connects through pgBouncer; writes and some reads go to the Master DB, while most reads go to Replicas 1 and 2, which receive streaming replication from the master.]
BUT HOW DO WE ROUTE READS TO REPLICAS?
• We were hoping there was a generic solution, something like pgBouncer, that would automatically route SELECT queries to the replicas, while sending all “write” requests to the master.
• Turns out it is nearly impossible to build a generic tool that does this well. For instance, how do you deal with a SELECT inside a transaction?
• As a result, most production-ready read/write splitting solutions are built into the application itself.
• We started looking for a Ruby solution, and were quickly unimpressed by everything we could find. One of the biggest issues was thread-safety: only one of the libraries we found appeared to be thread safe.
https://github.com/taskrabbit/makara
READ SPLITTING WITH MAKARA
Makara is a ruby gem from TaskRabbit that was already in production there, but only supported MySQL. We ported the database code from MySQL to PostgreSQL.
• It was the simplest library to understand, and to port to PG
• It worked in the multi-threaded environment of the Sidekiq background job framework
• Makara automatically retries if a replica goes down
• It load-balances with weights
• It was already running in production
CONSIDERATIONS WHEN USING REPLICATION
• The application must be tuned to support eventual consistency: data may not yet be on the replica!
• You must explicitly force a fetch from the master DB when it’s critical (e.g. right after a user account’s creation)
• We often use the pattern below: first try the fetch on the replica; if nothing is found, retry on the master DB
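The replica-then-master pattern above can be sketched in a few lines of plain Ruby. The names here are illustrative (not Wanelo’s actual code): `replica` and `master` stand in for any two objects that respond to `#find`, such as connection handles to the two databases.

```ruby
# "Try the replica first; if the row isn't there yet (replication lag),
# retry the read against the master."
def find_with_fallback(replica, master, id)
  replica.find(id) || master.find(id)
end

# Tiny in-memory stand-in used only to demonstrate the behavior:
class MemStore
  def initialize(rows)
    @rows = rows
  end

  def find(id)
    @rows[id]
  end
end
```

In a Rails app the same idea usually appears as a model-level helper that re-runs the query with the master connection forced.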
USEFUL TIP: REPLICAS CAN SPECIALIZE
Background workers can use a dedicated replica not shared with the app servers, to optimize the hit rate of the file system cache (ARC) on both replicas.
[Diagram: the PostgreSQL Master streams to Replicas 1–3. App Servers (Unicorn/Passenger Ruby VMs, times N) read from Replicas 1 and 2, whose ARC cache stays warm with web-traffic queries; Sidekiq/Resque Background Workers read from Replica 3, whose ARC cache stays warm with background-job queries.]
BIG HEAVY READS GO THERE
• Long, heavy queries should be run by background jobs against a dedicated replica, to isolate their effect on web traffic
• Each type of load will produce a unique set of data cached by the file system
[Diagram: PostgreSQL Master streaming to a dedicated Replica 1, which serves the Sidekiq/Resque Background Workers.]
4. UPGRADING (VIRTUAL) HARDWARE
SCALING UP
HARDWARE: IO & RAM
• Sounds obvious, but better or faster hardware is an obvious choice when scaling
• Large RAM will be used as the file system cache
• On Joyent’s SmartOS, the ARC file system cache is very effective
• shared_buffers should be set to 25% of RAM or 12GB, whichever is smaller
• Using a fast SSD disk array made an enormous difference
• Joyent’s native 16-disk RAID, managed by ZFS instead of a controller, provides excellent performance
(HEAVY) METAL IN THE CLOUD
SSD offerings from Joyent and AWS:
• Joyent’s “max” SSD node: $12.90/hr
• AWS “max” SSD node: $6.96/hr
SO WHO’S FASTER? WHO IS CHEAPER?
• JOYENT
  • 16 SSD drives: RAID10 + 2
  • SSD Make: DCS3700
  • CPU: E5-2690 @ 2.9GHz
• AWS
  • 8 SSD drives
  • SSD Make: ?
  • CPU: E5-2670 @ 2.6GHz
PERHAPS YOU GET WHAT YOU PAY FOR…
5. NO TOOL EXCELS AT EVERYTHING
SCALING UP
AND POSTGRESQL IS NO EXCEPTION.
WHEN POSTGRESQL IS NOT ENOUGH
Not every type of data is well suited to storing in, and quickly fetching from, a relational DB, even though initially it may be convenient. For example, our initial implementation of “text search” in PG became too slow once the number of documents reached 1M.
• Solr is great for full text search, and for deep-paginated sorted lists such as popular or related products
• ElasticSearch is a superset of Solr, but scales wide ad infinitum. We ran a 0.5TB ElasticSearch cluster.
• Redis is a great data store for transient or semi-persistent data with list, hash or set semantics
• RabbitMQ is a fantastic high-performance queue, with both point-to-point and pub-sub communication
SOME CAVEATS OF EACH STORE WE TRIED
• Solr is easy to replicate to 10-20 replicas, but they take a toll on the master.
  • Do not serve reads from the master.
  • For a high document update rate, set the number of documents per commit to a stratospheric value.
• ElasticSearch is difficult to manage and configure for high availability. Professional services cost a lot; the pricing is not startup friendly.
• Redis is not a transactional, or transactionally reliable, data store, despite what anyone says. Expect all data to go away at some point, and always have a way to rebuild it from the DB if it’s critical.
• RabbitMQ is great, but remember that queues and messages are not “durable” (i.e. persisted to disk) by default.
REDIS SIDETRACK: LESSONS LEARNED
This in-memory store is very good for certain applications where PostgreSQL isn’t.
• We use Redis for the ActivityFeed, precomputing each user’s feed at write time. We can regenerate it if the data is lost from Redis.
• We use twemproxy in front of Redis, which provides automatic horizontal sharding and connection pooling.
• We run clusters of 256 redis shards across many virtual zones; sharded redis instances use many cores, instead of just one (as a single instance would).
• Small redis shards can easily fork to back up the data, as each data set is small.
• We squeezed more performance out of Redis by packaging multiple…
I like to think of Redis as an in-memory cache with additional hash, set and list semantics. And they totally rock!
When Redis backs up data, or tries to replicate (ugh, that was rough), it forks. Memory requirements double!
6. MOVE EVENT-LIKE TABLES OUT OF POSTGRESQL
SCALING UP
EVENT LOGS & APPEND-ONLY TABLES
• Many analysts and business stakeholders like to collect an ever-growing list of metrics, e.g.:
  • User/business events such as “registered” or “ordered”
  • System events, such as “database went offline”
  • Click-stream events that follow the nginx access_log
  • State-change history on key models, like an Order
• These append-only tables often start in the database, but quickly prove to be a nuisance
• They generate very high write IO, often overwhelming the underlying hardware
MOVE EVENT TABLES OUT OF THE DATABASE
• We were appending all user events into a single user_events table
• The application was generating millions of rows per day!
• After realizing there was no reason this data needed to stay in PostgreSQL, we moved it out using a solution that combined:
  • Event dispatch using the ruby gem Ventable
  • Event recording using rsyslog
  • Data analysis using a combination of AWS Redshift and Joyent’s Manta
• Manta is an object store with a native compute facility that supports concurrent analysis of thousands of objects in parallel. It provides map/reduce, plus familiar tools like awk and grep for filtering and mapping.
• We eventually migrated most of the analytics to Redshift, in order to return to regular SQL for analytics.
DETAILED BLOG POST ABOUT THIS MIGRATION
http://wanelo.ly/event-collection
7. TUNE POSTGRESQL & THE FILESYSTEM
SCALING OUT: TUNING
THIS HAPPENED TO US
• Problem: zones (virtual hosts) with “write problems” appeared to be writing 16 times more data to disk than what the virtual file system reported
• vfsstat says 8MB/sec write volume
• iostat says 128MB/sec is actually written to disk
So what’s going on?
TUNING THE FILESYSTEM
• Turns out the default ZFS block size is 128KB, while the PostgreSQL page size is 8KB
• Every small write that touched a page had to write a full 128KB ZFS block to the disk
• This may be good for huge sequential writes, but not for random access and lots of tiny writes
TUNING ZFS & POSTGRESQL
• Solution: Joyent changed the ZFS block size for our zone, and the iostat write volume dropped to 8MB/sec
• We also added commit_delay
THIS KNOWLEDGE SHOULD BE PART OF THE INSTALL
• Many of these settings are the defaults in our open-source Chef cookbook for installing PostgreSQL from source:
https://github.com/wanelo-chef/postgres
• It installs PG in e.g. /opt/local/postgresql-9.5.0
• It configures its data in /var/pgsql/data95
• It allows seamless and safe upgrades of minor or major versions of PostgreSQL, never overwriting binaries
ONLINE RESOURCES ON PG TUNING
• Josh Berkus’s “5 Steps to PostgreSQL Performance” on SlideShare is fantastic:
http://www.slideshare.net/PGExperts/five-steps-perform2013
• The PostgreSQL wiki pages on performance tuning are excellent:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
http://wiki.postgresql.org/wiki/Performance_Optimization
8. BUFFERING & SERIALIZING UPDATES OF COUNTERS
SCALING OUT: PATTERNS
REDUCE WRITE IO AND LOCK CONTENTION
• Problem: products.saves_count is incremented (by 1) every time someone saves a product
• At 100s of inserts/sec, that’s a lot of updates
• Worse: 100s of concurrent requests trying to obtain a row-level lock on the same popular product
BUFFERING AND SERIALIZING
• The Sidekiq background job framework has two inter-related features:
  • scheduling jobs in the future (say, 10 minutes ahead)
  • the UniqueJob extension
• We increment a counter in redis, and enqueue a job that says “update this product in 10 minutes”
• Every 10 minutes, popular products are updated by adding the value stored in Redis to the database value, and resetting the Redis value to 0
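The scheme above can be sketched in pure Ruby. This is an illustrative model, not Wanelo’s code: a Hash stands in for the Redis counters, a Set stands in for Sidekiq’s UniqueJob de-duplication, and all class/method names are assumptions.

```ruby
require "set"

# Buffer saves_count increments in "Redis" and apply them to the
# "database" in one batched update per product.
class SaveCounterBuffer
  def initialize(db_counts)
    @db    = db_counts   # simulates products.saves_count in PostgreSQL
    @redis = Hash.new(0) # simulates per-product counters in Redis
    @queue = Set.new     # simulates UniqueJob: one pending job per product
  end

  # Called on every "save" request: cheap increment, at most one queued job.
  def record_save(product_id)
    @redis[product_id] += 1
    @queue.add(product_id) # no-op if an update job is already enqueued
  end

  # Runs ~10 minutes later in a worker: fold the delta into one row update.
  def process_job(product_id)
    delta = @redis[product_id]
    @redis[product_id] = 0   # reset the Redis value
    @db[product_id] += delta # a single UPDATE instead of N row locks
    @queue.delete(product_id)
  end

  # Read-consistent count: database value + not-yet-applied Redis value.
  def count(product_id)
    @db[product_id] + @redis[product_id]
  end
end
```

The `count` method mirrors the read-consistency trick mentioned later in the deck: displaying database value + redis value at read time.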
BUFFERING IN PICTURES
[Diagram: each “Save Product” request (1) enqueues a delayed update request for the product — skipped if an update request is already on the queue — and (2) increments a counter in the Redis cache; later the job is processed (3), the Redis counter is read & reset to 0 (4), and the product row is updated in PostgreSQL (5).]
BUFFERING CONCLUSIONS
• If we show objects from the database, their counters might sometimes be behind. That might be OK…
• If not, to achieve read consistency, we can display the count as database value + redis value at read time
9. OPTIMIZING THE SCHEMA FOR SCALE
SCALING OUT: PATTERNS
MVCC DOES COPY ON WRITE
Problem: heavy writes on the master DB, because PostgreSQL rewrites the entire row for most updates.
Some exceptions exist, e.g. updating a non-indexed integer column, a counter, a timestamp, or another simple non-indexed type.
• But we often index these columns so we can sort by them
• So rewriting a user means rewriting the entire row
• Solution: move frequently updated columns away
TOO MANY WRITES: THIS IS HOW IT BEGINS
• We notice how many writes we are doing on the database machine, and become curious. Something must not be right. What is it?
• A quick check of pg_stat_user_tables reveals that our users table is doing a huge number of updates, many of which are not “hot” updates
• Subsequent research reveals the line at fault: we update the entire user row on every login!
SCHEMA TRICKS
• Split wide tables that get a lot of updates into two or more 1-1 tables, to reduce the impact of each update
• Much less vacuuming is required when smaller tables are frequently updated, especially if this allows the updates to remain “hot”
10. VERTICAL SHARDING
SCALING OUT: PATTERNS
VERTICAL SHARDING – WHAT IS IT?
• Heavy tables with too many writes can be moved into their own separate database
• For us it was saves: now at 3B+ rows
  • At hundreds of inserts per second, with 4 indexes, we were feeling the pain
  • A “Save” is like a “Like” on Instagram, or a “Pin” on Pinterest
• It turns out moving a single table out (in Rails) is not a huge effort: it took our team 3 days
VERTICAL SHARDING – HOW?
• Update the code to point to both the old and the new databases (the new one only for the split model)
• Implement any dynamic Rails association methods as real methods
  • e.g. save.products becomes a method on the Save model, looking up Products by IDs
• Update the development and test setups with two primary databases, and fix all the tests
http://wanelo.ly/vertical-sharding
APPLICATION TALKING TO TWO DATABASES
[Diagram: the Web App talks to the PostgreSQL Master (Main Schema) and its Replica (Main Schema), plus a vertically sharded database: a separate PostgreSQL Master holding the split table.]
VERTICAL SHARDING – DEPLOYING
• Write IO on the main DB dropped after splitting off the high-IO table onto a dedicated compute node
11. WRAPPING VERTICALLY SHARDED DATA WITH MICRO-SERVICES
SCALING OUT: PATTERNS
SPLITTING OFF MICRO-SERVICES
• Vertical sharding is a great precursor to a micro-services architecture
• We already have saves in another database; let’s migrate it to a light-weight HTTP service
• New service: Sinatra, client and server libs, updated tests & development, CI, deployment, all without changing the db schema
• Level of effort: a pair of engineers for 2-3 weeks
ADAPTER DESIGN PATTERN TO THE RESCUE
We used the Adapter pattern to write two client adapters, native and HTTP, so we could start using the client library without yet switching to HTTP.
[Diagram: the Main App (Unicorn w/ Rails) reaches PostgreSQL either through the Native Client Adapter, or through the HTTP Client Adapter calling the Service App (Unicorn w/ Sinatra).]
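A minimal sketch of that adapter setup, with hypothetical class and method names (the deck does not show Wanelo’s actual interfaces): the caller depends only on a shared `#save_count` interface, so the native and HTTP backends are interchangeable.

```ruby
# Native adapter: talks to the saves database directly.
class NativeSavesAdapter
  def initialize(db)
    @db = db # any object answering #fetch, standing in for a DB handle
  end

  def save_count(user_id)
    @db.fetch(user_id, 0) # stands in for a SQL query against saves
  end
end

# HTTP adapter: same interface, but calls the Sinatra service instead.
class HttpSavesAdapter
  def initialize(client)
    @client = client
  end

  def save_count(user_id)
    @client.get("/users/#{user_id}/saves/count").to_i
  end
end

# The caller never knows which backend is in use:
def render_profile(adapter, user_id)
  "#{adapter.save_count(user_id)} saves"
end
```

Because both adapters expose the same interface, switching from the native path to the HTTP service becomes a configuration change rather than a code change.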
SERVICES CONCLUSIONS
• Now we can independently scale the service backend, in particular reads, by using replicas
• This prepares us for the next inevitable step: horizontal sharding
• The cost: added request latency, lots of extra code, extra runtime infrastructure, and 2 weeks of work
• Do this only if you absolutely have to: it adds complexity, more moving parts, etc. This is not to be taken lightly!
12. SHARDING THE BACKEND BEHIND MICRO-SERVICES HORIZONTALLY
SCALING OUT: PATTERNS
HORIZONTAL SHARDING CONCEPTS
• We wanted to stick with PostgreSQL for critical data such as saves, and avoid learning a new tool
• We really liked Instagram’s approach with schemas
• We built our own schema-based sharding in ruby, on top of the Sequel gem, and open sourced it
• It supports mapping of physical to logical shards, and connection pooling
SCHEMA DESIGN FOR HORIZONTAL SHARDING
https://github.com/wanelo/sequel-schema-sharding
UserSaves (sharded by user_id):
  user_id, product_id, collection_id, created_at
  index__on_user_id_and_collection_id
ProductSaves (sharded by product_id):
  product_id, user_id, updated_at
  index__on_product_id_and_user_id
  index__on_product_id_and_updated_at
• We needed two lookups, by user_id and by product_id, hence two tables, independently sharded
• Since saves is a join table between user, product and collection, we did not need a unique generated ID
• Instead we use a composite base62-encoded ID: fpua-1BrV-1kKEt
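A composite base62 ID like fpua-1BrV-1kKEt can be sketched as three base62-encoded integers joined with dashes. The alphabet ordering and the field order (user_id, product_id, collection_id) here are assumptions for illustration; the deck does not specify them.

```ruby
# 0-9, a-z, A-Z: 62 symbols for base62 encoding (assumed alphabet).
ALPHABET = ("0".."9").to_a + ("a".."z").to_a + ("A".."Z").to_a

def base62_encode(n)
  return "0" if n.zero?
  s = ""
  while n > 0
    s = ALPHABET[n % 62] + s
    n /= 62
  end
  s
end

def base62_decode(s)
  s.chars.reduce(0) { |acc, c| acc * 62 + ALPHABET.index(c) }
end

# A save is identified by its (user, product, collection) triple,
# so no separate generated ID is needed.
def composite_id(user_id, product_id, collection_id)
  [user_id, product_id, collection_id].map { |n| base62_encode(n) }.join("-")
end

def parse_composite_id(id)
  id.split("-").map { |part| base62_decode(part) }
end
```

The ID is reversible, so the service can route any save back to its shard keys without a lookup table.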
SPREADING YOUR SHARDS :)
• We split saves into 8192 logical shards, distributed across 8 PostgreSQL databases
• Running on 8 virtual zones spanning 2 physical SSD servers (4 zones per compute node)
• Each database has 1024 schemas per table (twice that overall, because we sharded saves into two tables)
• Each server: 2 x 32-core, 256GB RAM, 16-drive SSD RAID10+2, PostgreSQL 9.3
Use our ruby library to do this:
https://github.com/wanelo/sequel-schema-sharding
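The logical-to-physical layout above can be sketched as follows. This is in the spirit of sequel-schema-sharding, but the hashing, range assignment, and schema naming are assumptions for illustration, not the gem’s actual implementation.

```ruby
NUM_LOGICAL_SHARDS = 8192
NUM_DATABASES      = 8
SHARDS_PER_DB      = NUM_LOGICAL_SHARDS / NUM_DATABASES # 1024 schemas per DB

# Map a shard key (e.g. user_id for UserSaves) to a logical shard.
def logical_shard(user_id)
  user_id % NUM_LOGICAL_SHARDS
end

# Map a logical shard to one of the physical databases: contiguous
# ranges of 1024 shards live on each database.
def physical_database(shard)
  shard / SHARDS_PER_DB
end

# Each logical shard is a separate PostgreSQL schema (hypothetical naming).
def schema_name(shard)
  "user_saves_#{shard}"
end
```

Because the shard count (8192) far exceeds the database count (8), scaling out later means moving whole schemas to new databases and updating the mapping, without resharding any rows.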
QUESTION: HOW CAN WE MIGRATE THE DATA FROM THE OLD BACKEND TO THE NEW HORIZONTALLY SHARDED ONE, BUT WITHOUT ANY DOWNTIME?
YES! NEW RECORDS GO TO BOTH
[Diagram: “Create Save” requests hit the HTTP Service, which reads/writes the old non-sharded backend and also enqueues a job on the Sidekiq queue; a background worker applies each new save to the new sharded backend (shards 1–4).]
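The dual-write step above can be sketched in pure Ruby. In-memory arrays stand in for the two backends and the Sidekiq queue; all names are illustrative, not Wanelo’s actual code.

```ruby
# Zero-downtime migration, step 1: every new save is written to the old
# backend synchronously and enqueued for the new sharded backend.
class DualWriter
  attr_reader :old_backend, :new_backend

  def initialize
    @old_backend = [] # the non-sharded saves table (still serves reads)
    @new_backend = [] # the new horizontally sharded backend
    @queue = []       # stands in for the Sidekiq queue
  end

  def create_save(save)
    @old_backend << save # synchronous read/write path stays on the old DB
    @queue << save       # enqueue the same save for the background worker
  end

  # Runs in a background worker: drain queued saves into the new backend.
  def drain!
    @new_backend << @queue.shift until @queue.empty?
  end
end
```

Once all new writes land in both places, a migration script backfills the old rows, and the backends can finally be swapped with no downtime.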
MIGRATE OLD ROWS
[Diagram: the same dual-write setup, with a migration script additionally copying old rows from the non-sharded backend into the sharded one.]
We migrated several times before we got this right…
SWAP OLD AND NEW BACKENDS
[Diagram: the two backends swap roles: the HTTP Service now reads and writes the new sharded backend directly, while the background worker path covers the old non-sharded backend.]
HORIZONTAL SHARDING CONCLUSIONS
• This is the final destination of any scalable architecture: just add more boxes
• We are pretty sure we can now support 1,000-100,000 inserts/second by scaling out wide
• This effort took 2 engineers 2 months, including the migration, and we managed to do it with zero downtime
• You can arrive here incrementally, like we did, without too much added cost. But don’t start with this on a new application!
https://github.com/wanelo/sequel-schema-sharding
THOUGHTS ON MICRO-SERVICES
• The complexity of the new micro-services infrastructure does not come for free
• It requires new code, new automation, testing, deployment, monitoring, graphing, maintenance and upgrades, and comes with its own unique set of bugs and problems
• But the advantages, in this case, far outweigh the cost, particularly with billion+ row data sets and/or large teams:
  • Autonomy by ownership: a dedicated team for each service (aka the Twitter model)
  • Each service can be scaled up independently
https://sudo.hailoapp.com/services/2015/03/09/journey-into-a-microservice-world-part-3/
CONCLUDING THOUGHTS
• Hopefully you can see that it is possible to scale an application to millions of users methodically and incrementally
• The patterns presented here can be readily copied and implemented on any application that’s running slow, or having difficulty supporting a growing user base. Congrats, these are great problems to have!
ACKNOWLEDGEMENTS
Finally, our learnings and discovery of these solutions would not have been possible without:
• Obsessive monitoring and debugging, made possible by SmartOS, PostgreSQL and such tools as: dTrace, htop, vfsstat, iostat, prstat, nagios, statsd, graphite
• Excellent performance insight products from NewRelic and Circonus
• Help from the wonderful folks at PGExperts and Joyent
• And the relentless professionalism, zeal and ingenuity of Wanelo’s Engineering Team
This is an early Wanelo team watching “Mean Girls” as part of cultural education.