Facebook has over 500 million active users, with half logging in every day. It processes over 4 trillion feed actions per day and caches over 2 trillion objects. Facebook has scaled to over 1 million active users per engineer, significantly more efficient than other large tech companies. To achieve this scale, Facebook relies on techniques like frequent small releases, dark launching of major changes, and shedding load during outages to maintain reliability as the site grows.
Shared Personalization Service - How To Scale to 15K RPS, Patrice Pelland
The document summarizes the Shared Personalization Service (SPS) developed by Microsoft to enable explicit and implicit personalization at scale. SPS uses AppFabric caching and SQL partitioning to support 150 million home page visits per month with peaks of 15,000 requests per second. It provides a stateless architecture with no single point of failure and aims for read latencies less than 25ms and update latencies less than 50ms. SPS deploys across multiple regions using caching, databases, and file servers for availability and scalability.
Social Monitoring Tool codename Looking Glass, Patrice Pelland
The document discusses the progression of a social monitoring incubation project from a Windows Server backend to Windows Azure.
It began as a Silverlight application with code reuse across Windows Phone and iOS. The initial backend used Windows Server and SQL Server but struggled to scale.
It then moved to using Windows Azure web and worker roles with SQL Azure and Azure storage to improve scalability. This allowed for distributed data acquisition and indexing to handle larger datasets.
The final phase utilized multiple Azure services including data aggregators, indexers, blob storage and visualization to create a more scalable, reliable and complex social monitoring solution. Moving to Azure addressed the project's scalability issues in a cost effective way.
Real time indexes in Sphinx, Yaroslav Vorozhko
This presentation introduces the new real-time indexes feature in Sphinx Search 1.10. It discusses the shortcomings of plain indexes and how real-time indexes allow data to be indexed and updated on the fly via the MySQL protocol (SphinxQL). Performance tests show real-time indexes using less disk space and performing better for single and multi-queries than plain indexes, especially under load. The presentation demonstrates creating real-time indexes and performing CRUD operations on them, along with migration tools and strategies.
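As a rough sketch of the workflow the presentation demonstrates (the index name, fields, and paths here are hypothetical), a real-time index is declared in sphinx.conf and then populated and queried over SphinxQL:

```ini
# sphinx.conf -- a minimal real-time index definition (Sphinx 1.10+)
index rt_example
{
    type         = rt
    path         = /var/lib/sphinx/rt_example
    rt_field     = title
    rt_field     = content
    rt_attr_uint = group_id
}
```

```sql
-- Connect with any MySQL client to searchd's SphinxQL port, then:
INSERT INTO rt_example (id, title, content, group_id)
VALUES (1, 'hello', 'hello world document', 10);

SELECT * FROM rt_example WHERE MATCH('hello');

DELETE FROM rt_example WHERE id = 1;
```

Unlike plain indexes, no external indexer run is needed; the insert is searchable immediately.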
InnoDB Architecture and Performance Optimization, Peter Zaitsev
This document provides an overview of the InnoDB architecture and performance optimization. It discusses the general architecture, including row-based storage, tablespaces, logs, and the buffer pool. It covers topics like indexing, transactions, locking, and multi-version concurrency control. Optimization techniques are presented, such as tuning memory configuration, disk I/O, and purge (the cleanup of old row versions). Understanding these internals is key to advanced performance tuning of the InnoDB storage engine in MySQL.
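For illustration, a my.cnf fragment touching the tuning areas above might look like the following; the values are hypothetical and depend entirely on available RAM and workload:

```ini
[mysqld]
# Buffer pool caches data and indexes; often sized to most of the RAM
# on a dedicated database server
innodb_buffer_pool_size = 4G

# Larger redo logs reduce checkpoint flushing, at the cost of longer
# crash recovery
innodb_log_file_size = 256M

# 1 = flush the log at every commit (fully durable);
# 2 trades some durability for throughput
innodb_flush_log_at_trx_commit = 1

# Bypass the OS page cache for data files to avoid double buffering
innodb_flush_method = O_DIRECT
```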
Goal Driven Performance Optimization, Peter Zaitsev
The document discusses goal driven performance optimization. It emphasizes setting clear performance goals based on metrics like response time and throughput. Goals should be set for different types of requests and measured regularly. Instrumentation of the system is important to identify bottlenecks and queries that are causing slowdowns. The key is to prioritize optimization efforts on the most important user interactions that are not meeting goals. Taking a goal-driven approach focuses work on the most significant performance issues.
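As a toy illustration of the goal-driven approach (the request types, goal values, and sample timings below are all made up), one can record response times per request type and check a percentile against the goal for that type:

```python
# Sketch: check measured response times against per-request-type goals.
def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times (seconds)."""
    ordered = sorted(samples)
    k = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[k]

# Hypothetical 95th-percentile targets, in seconds, per request type
goals = {"search": 0.5, "checkout": 1.0}

def meets_goal(request_type, samples, pct=95):
    """True if the pct-th percentile response time is within the goal."""
    return percentile(samples, pct) <= goals[request_type]

# Hypothetical measurements for the "search" request type
search_times = [0.12, 0.2, 0.31, 0.45, 0.48, 0.6]
print(meets_goal("search", search_times))  # → False: p95 is 0.6 s, goal is 0.5 s
```

The point of the exercise is prioritization: only the request types whose percentile misses its goal get optimization effort.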
The Magic of Hot Streaming Replication, Bruce Momjian
This document discusses PostgreSQL 9.0's new capabilities for maintaining an up-to-date standby server and running read-only queries on it. It covers how to configure continuous archiving from the primary server to the standby, and how to configure the standby for streaming replication so it can accept queries. The document demonstrates streaming replication by creating a table and inserting a row on the primary, which then appears on the standby.
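A minimal configuration sketch for the setup the document describes, assuming PostgreSQL 9.0 and hypothetical hostnames and paths:

```ini
# primary: postgresql.conf
wal_level = hot_standby        # log enough WAL detail for a queryable standby
max_wal_senders = 3            # allow streaming connections from standbys
archive_mode = on
archive_command = 'cp %p /archive/%f'   # continuous archiving destination

# primary: pg_hba.conf -- allow the standby host to connect for replication
# host  replication  postgres  192.168.0.10/32  trust

# standby: postgresql.conf
hot_standby = on               # accept read-only queries during recovery

# standby: recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com port=5432 user=postgres'
```

With this in place, a row inserted on the primary streams to the standby and becomes visible to read-only queries there.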
Rapid Upgrades With Pg_Upgrade, Bruce Momjian
Pg_Upgrade allows migration between major releases of Postgres without dumping and reloading data. It works by installing the new Postgres system tables while continuing to use the data files from the previous version. Pg_Upgrade freezes all rows in the new cluster, copies over transaction logs and IDs from the old cluster, restores the database schema, and finally copies over user data files. This process allows for much faster upgrades than traditional dump and restore methods.
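The process above boils down to a single command run while both clusters are shut down; the directory paths here are hypothetical:

```shell
# Sketch: in-place major-version upgrade with pg_upgrade.
# Both old and new server binaries must be installed, and the new
# cluster must already be initialized with initdb.
pg_upgrade \
  --old-bindir=/usr/lib/postgresql/8.4/bin \
  --new-bindir=/usr/lib/postgresql/9.0/bin \
  --old-datadir=/var/lib/postgresql/8.4/main \
  --new-datadir=/var/lib/postgresql/9.0/main
```

The `--link` option uses hard links instead of copying the user data files, which makes the upgrade much faster but ties the new cluster to the old cluster's files.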
Managing replication of PostgreSQL, Simon Riggs
This document discusses PostgreSQL replication. It covers the different use cases for replication, including high availability, scalability, and data protection. It describes the replication mechanisms available in PostgreSQL, including trigger-based and log-based replication, and outlines how log shipping developed across PostgreSQL versions. It focuses on streaming replication introduced in version 9.0, describing the WAL sender, WAL receiver, and hot standby capabilities. It discusses tools like repmgr that help manage replication and monitor delays. Planned features such as synchronous replication and looser coupling of replication are also mentioned.