The document discusses PostgreSQL high availability and scaling options. It covers horizontal scaling using load balancing and data partitioning across multiple servers. It also covers high availability techniques like master-slave replication, warm standby servers with point-in-time recovery, and using a heartbeat to prevent multiple servers from becoming a master. The document recommends an initial architecture with two servers using warm standby and point-in-time recovery with a heartbeat for high availability. It suggests scaling the application servers horizontally later on if more capacity is needed.
This presentation answer a lot of your questions about PostgreSQL and the Red Hat Cluster Suite.
It reviews how you can create failover/standby capabilities with the following activities:
General PostgreSQL clustering options
Overview of Red Hat Cluster Service
Identification of candidate databases for clustering
Identification of hardware for clustering
Analysis of uptime requirements and data latency
Implementation of clustering
Testing of clustering
PostgreSQL installation tips for RHCS
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Command Prompt., Inc
Alex Alexander & Linas Virbalas
Hot standby and streaming replication will move the needle forward for high availability and scaling for a wide number of applications. Tungsten already supports clustering using warm standby. In this talk we will describe how to build clusters using the new PostgreSQL features and give our report from the trenches.
This talk will cover how hot standby and streaming replication work from a user perspective, then dive into a description of how to use them, taking Tungsten as an example. We'll cover the following issues:
* Configuration of warm standby and streaming replication
* Provisioning new standby instances
* Strategies for balancing reads across primary and standby database
* Managing failover
* Troubleshooting and gotchas
Please join us for an enlightening discussion a set of PostgreSQL features that are interesting to a wide range of PostgreSQL users.
Various HA and DR setups for Postgres Plus Advanced Server -
Active – Passive OS HA Clustering
Log Shipping Replication (Hot Standby Mode)
Hot Streaming Replication (Hot Standby Mode)
EDB Postgres Plus Failover Manager
HA with read scaling (with pg-pool)
xDB Single Master Replication (SMR)
xDB Multi Master Replication (MMR)
Use Cases
OpenStack is rapidly gaining popularity with businesses as they realize the benefits of a private cloud architecture. This presentation was delivered by Dave Page, Chief Architect, Tools & Installers at EnterpriseDB & PostgreSQL Core Team member during PG Open 2014. He addressed some of the common components of OpenStack deployments, how they can affect Postgres servers, and how users might best utilize some of the features they offer when deploying Postgres, including:
• Different configurations for the Nova compute service
• Use of the Cinder block store
• Virtual networking options with Neutron
• WAL archiving with the Swift object store
This presentation answer a lot of your questions about PostgreSQL and the Red Hat Cluster Suite.
It reviews how you can create failover/standby capabilities with the following activities:
General PostgreSQL clustering options
Overview of Red Hat Cluster Service
Identification of candidate databases for clustering
Identification of hardware for clustering
Analysis of uptime requirements and data latency
Implementation of clustering
Testing of clustering
PostgreSQL installation tips for RHCS
Building tungsten-clusters-with-postgre sql-hot-standby-and-streaming-replica...Command Prompt., Inc
Alex Alexander & Linas Virbalas
Hot standby and streaming replication will move the needle forward for high availability and scaling for a wide number of applications. Tungsten already supports clustering using warm standby. In this talk we will describe how to build clusters using the new PostgreSQL features and give our report from the trenches.
This talk will cover how hot standby and streaming replication work from a user perspective, then dive into a description of how to use them, taking Tungsten as an example. We'll cover the following issues:
* Configuration of warm standby and streaming replication
* Provisioning new standby instances
* Strategies for balancing reads across primary and standby database
* Managing failover
* Troubleshooting and gotchas
Please join us for an enlightening discussion a set of PostgreSQL features that are interesting to a wide range of PostgreSQL users.
Various HA and DR setups for Postgres Plus Advanced Server -
Active – Passive OS HA Clustering
Log Shipping Replication (Hot Standby Mode)
Hot Streaming Replication (Hot Standby Mode)
EDB Postgres Plus Failover Manager
HA with read scaling (with pg-pool)
xDB Single Master Replication (SMR)
xDB Multi Master Replication (MMR)
Use Cases
OpenStack is rapidly gaining popularity with businesses as they realize the benefits of a private cloud architecture. This presentation was delivered by Dave Page, Chief Architect, Tools & Installers at EnterpriseDB & PostgreSQL Core Team member during PG Open 2014. He addressed some of the common components of OpenStack deployments, how they can affect Postgres servers, and how users might best utilize some of the features they offer when deploying Postgres, including:
• Different configurations for the Nova compute service
• Use of the Cinder block store
• Virtual networking options with Neutron
• WAL archiving with the Swift object store
Architecture for building scalable and highly available Postgres ClusterAshnikbiz
As PostgreSQL has made way into business critical applications, many customers who are using Oracle RAC for high availability and load balancing have asked for similar functionality for using PostgreSQL.
In this Hangout session we would discuss architecture and alternatives, based on real life experience, for achieving high availability and load balancing functionality when you deploy PostgreSQL. We will also present some of the key tools and how to deploy them for effectiveness of this architecture.
Linux internals for Database administrators at Linux Piter 2016PostgreSQL-Consulting
Input-output performance problems are on every day agenda for DBAs since the databases exist. Volume of data grows rapidly and you need to get your data fast from the disk and moreover - fast to the disk. For most databases there is a more or less easy to find checklist of recommended Linux settings to maximize IO throughput. In most cases that checklist is good enough. But it is always better to understand how it works, especially if you run into some corner-cases. This talk is about how IO in Linux works, how database pages travel from disk level to database own shared memory and back and what kind of mechanisms exist to control this. We will discuss memory structures, swap and page-out daemons, filesystems, schedullers and IO methods. Some fundamental differences in IO approaches between PostgreSQL, Oracle and MySQL will be covered.
Right-Sizing your SQL Server Virtual Machineheraflux
Virtualizing your top-tier production SQL Servers is not as easy as P2V’ing it. Sometimes allocating more resources to the VM is the wrong approach, and getting it wrong will silently hurt performance. What is the most effective method for determining the ‘right’ amount of resources to allocate? What happens if the workload changes a month from now?
The methods for understanding the performance of your mission-critical SQL Servers gathered over the past ten years of SQL Server virtualization will be addressed, and valuable processes for performance statistic collection and analysis will be displayed. Come learn how to properly ‘right-size’ the resources allocated to a VM, improve the performance of your SQL Servers, and keep it maximized well into the future.
Tuning the Memory and Optimizer Parameters - This topic has come quite often in requests and suggestions from our regular viewers and followers of our Postgres Hangouts.
So we picked it up as the topic for this month's Postgres Hangout. This time we will try to explore and share on Memory and Memory .
Join us as we discuss what impacts your decision before you put a number against each of these important parameters!
My experience with embedding PostgreSQLJignesh Shah
At my current company, we embed PostgreSQL based technologies in various applications shipped as shrink-wrapped software. In this session we talk about the experience of embedding PostgreSQL where it is not directly exposed to end-user and the issues encountered on how they were resolved.
We will talk about business reasons,technical architecture of deployments, upgrades, security processes on how to work with embedded PostgreSQL databases.
This presentation surveys different ways one can geographically distribute PostgreSQL, including master-slave and multi-master solutions. It discusses pitfalls and emphasizes understanding requirements. The presentation covers some of the existing tools that are available in the community. It also touches upon upcoming PostgreSQL solutions.
Architecture for building scalable and highly available Postgres ClusterAshnikbiz
As PostgreSQL has made way into business critical applications, many customers who are using Oracle RAC for high availability and load balancing have asked for similar functionality for using PostgreSQL.
In this Hangout session we would discuss architecture and alternatives, based on real life experience, for achieving high availability and load balancing functionality when you deploy PostgreSQL. We will also present some of the key tools and how to deploy them for effectiveness of this architecture.
Linux internals for Database administrators at Linux Piter 2016PostgreSQL-Consulting
Input-output performance problems are on every day agenda for DBAs since the databases exist. Volume of data grows rapidly and you need to get your data fast from the disk and moreover - fast to the disk. For most databases there is a more or less easy to find checklist of recommended Linux settings to maximize IO throughput. In most cases that checklist is good enough. But it is always better to understand how it works, especially if you run into some corner-cases. This talk is about how IO in Linux works, how database pages travel from disk level to database own shared memory and back and what kind of mechanisms exist to control this. We will discuss memory structures, swap and page-out daemons, filesystems, schedullers and IO methods. Some fundamental differences in IO approaches between PostgreSQL, Oracle and MySQL will be covered.
Right-Sizing your SQL Server Virtual Machineheraflux
Virtualizing your top-tier production SQL Servers is not as easy as P2V’ing it. Sometimes allocating more resources to the VM is the wrong approach, and getting it wrong will silently hurt performance. What is the most effective method for determining the ‘right’ amount of resources to allocate? What happens if the workload changes a month from now?
The methods for understanding the performance of your mission-critical SQL Servers gathered over the past ten years of SQL Server virtualization will be addressed, and valuable processes for performance statistic collection and analysis will be displayed. Come learn how to properly ‘right-size’ the resources allocated to a VM, improve the performance of your SQL Servers, and keep it maximized well into the future.
Tuning the Memory and Optimizer Parameters - This topic has come quite often in requests and suggestions from our regular viewers and followers of our Postgres Hangouts.
So we picked it up as the topic for this month's Postgres Hangout. This time we will try to explore and share on Memory and Memory .
Join us as we discuss what impacts your decision before you put a number against each of these important parameters!
My experience with embedding PostgreSQLJignesh Shah
At my current company, we embed PostgreSQL based technologies in various applications shipped as shrink-wrapped software. In this session we talk about the experience of embedding PostgreSQL where it is not directly exposed to end-user and the issues encountered on how they were resolved.
We will talk about business reasons,technical architecture of deployments, upgrades, security processes on how to work with embedded PostgreSQL databases.
This presentation surveys different ways one can geographically distribute PostgreSQL, including master-slave and multi-master solutions. It discusses pitfalls and emphasizes understanding requirements. The presentation covers some of the existing tools that are available in the community. It also touches upon upcoming PostgreSQL solutions.
The latest version of my PostgreSQL introduction for IL-TechTalks, a free service to introduce the Israeli hi-tech community to new and interesting technologies. In this talk, I describe the history and licensing of PostgreSQL, its built-in capabilities, and some of the new things that were added in the 9.1 and 9.2 releases which make it an attractive option for many applications.
In 40 minutes the audience will learn a variety of ways to make postgresql database suddenly go out of memory on a box with half a terabyte of RAM.
Developer's and DBA's best practices for preventing this will also be discussed, as well as a bit of Postgres and Linux memory management internals.
This presentation is for those who are familiar with databases and SQL, but want to learn how to move processing from their applications into the database to improve consistency, administration, and performance. Topics covered include advanced SQL features like referential integrity constraints, ANSI joins, views, rules, and triggers. The presentation also explains how to create server-side functions, operators, and custom data types in PostgreSQL.
Howdah - An Application using Pylons, PostgreSQL, Simpycity and ExceptableCommand Prompt., Inc
Aurynn Shaw
This mini-tutorial covers building a small application on Howdah, an open source, Python based web development framework by Commandprompt, Inc. We will cover the full process of designing a vertically coherent application on Howdah, integrating DB-level stored procedures, DB exception propagation through Exceptable, DB access through Simpycity, authentication through repoze.who, permissions through VerticallyChallenged, and application views through Pylons. By the end of the talk, we will have covered a full application built on The Stack, and how to cover common pitfalls in using Howdah components.
This presentation covers all aspects of PostgreSQL administration, including installation, security, file structure, configuration, reporting, backup, daily maintenance, monitoring activity, disk space computations, and disaster recovery. It shows how to control host connectivity, configure the server, find the query being run by each session, and find the disk space used by each database.
PostgreSQL is a wild beast and a wrong approach can become a slow descent into hell. We'll talk about common mistakes, confusing jargon, the online manual's lost pages (well hidden in the source code) and best practices to avoid headaches for your DBA.
Gluster for Geeks: Performance Tuning Tips & TricksGlusterFS
In this Gluster for Geeks technical webinar, Jacob Shucart, Senior Systems Engineer, will provide useful tips and tricks to make a Gluster cluster meet your performance requirements. He will review considerations for all different phases including planning, configuration, implementation, tuning, and benchmarking.
Topics covered will include:
• Protocols (CIFS, NFS, GlusterFS)
• Hardware configuration
• Tuning parameters
• Performance benchmarks
Die 10 besten PostgreSQL-Replikationsstrategien für Ihr UnternehmenEDB
Dieses Webinar hilft Ihnen, die Unterschiede zwischen den verschiedenen Replikationsansätzen zu verstehen, die Anforderungen der jeweiligen Strategie zu erkennen und sich über die Möglichkeiten klar zu werden, was mit jeder einzelnen zu erreichen ist. Damit werden Sie hoffentlich eher in der Lage sein, herauszufinden, welche PostgreSQL-Replikationsarten Sie wirklich für Ihr System benötigen.
- Wie physische und logische Replikation in PostgreSQL funktionieren
- Unterschiede zwischen synchroner und asynchroner Replikation
- Vorteile, Nachteile und Herausforderungen bei der Multi-Master-Replikation
- Welche Replikationsstrategie für unterschiedliche Use-Cases besser geeignet ist
Referent:
Borys Neselovskyi, Regional Sales Engineer DACH, EDB
------------------------------------------------------------
For more #webinars, visit http://bit.ly/EDB-Webinars
Download free #PostgreSQL whitepapers: http://bit.ly/EDB-Whitepapers
Read our #Postgres Blog http://bit.ly/EDB-Blogs
Follow us on Facebook at http://bit.ly/EDB-FB
Follow us on Twitter at http://bit.ly/EDB-Twitter
Follow us on LinkedIn at http://bit.ly/EDB-LinkedIn
Reach us via email at marketing@enterprisedb.com
WANdisco is a provider of non-stop software for global enterprises to meet the challenges of Big Data and distributed software development.
KEY HIGHLIGHTS, Session 1: Tuesday, Feb. 26, 5:15 p.m.-6 p.m.
Hadoop and HBase on the Cloud: A Case Study on Performance and Isolation
Cloud infrastructure is a flexible tool to orchestrate multiple Hadoop and HBase clusters, which provides strict isolation of data and compute resources for multiple customers. Most importantly, our benchmarks show that virtualized environment allows for higher average utilization of per-node resources. For more session information, visit http://na.apachecon.com/schedule/presentation/131/.
CO-PRESENTERS, Dr. Konstantin V. Shvachko, Chief Architect, Big Data, WANdisco and Jagane Sundar, CTO/VP Engineering, Big Data, WANdisco
A veteran Hadoop developer and respected author, Konstantin Shvachko is a technical expert specializing in efficient data structures and algorithms for large-scale distributed storage systems. Konstantin joined WANdisco through the AltoStor acquisition and before that he was founder and Chief Scientist at AltoScale, a Hadoop and HBase-as-a-Platform company acquired by VertiCloud. Konstantin played a lead architectural role at eBay, building two generations of the organization's Hadoop platform. At Yahoo!, he worked on the Hadoop Distributed File System (HDFS). Konstantin has dozens of publications and presentations to his credit and is currently a member of the Apache Hadoop PMC. Konstantin has a Ph.D. in Computer Science and M.S. in Mathematics from Moscow State University, Russia.
Jagane Sundar has extensive big data, cloud, virtualization, and networking experience and joined WANdisco through its AltoStor acquisition. Before AltoStor, Jagane was founder and CEO of AltoScale, a Hadoop and HBase-as-a-Platform company acquired by VertiCloud. His experience with Hadoop began as Director of Hadoop Performance and Operability at Yahoo! Jagane has such accomplishments to his credit as the creation of Livebackup, development of a user mode TCP Stack for Precision I/O, development of the NFS and PPP clients and parts of the TCP stack for JavaOS for Sun MicroSystems, and more. Jagane received his B.E. in Electronics and Communications Engineering from Anna University.
About WANdisco
WANdisco ( LSE : WAND ) is a provider of enterprise-ready, non-stop software solutions that enable globally distributed organizations to meet today's data challenges of secure storage, scalability and availability. WANdisco's products are differentiated by the company's patented, active-active data replication technology, serving crucial high availability (HA) requirements, including Hadoop Big Data and Application Lifecycle Management (ALM). Fortune Global 1000 companies including AT&T, Motorola, Intel and Halliburton rely on WANdisco for performance, reliability, security and availability. For additional information, please visit www.wandisco.com.
The Anatomy Of The Google Architecture Fina Lv1.1Hassy Veldstra
A comprehensive overview of Google's architecture - starting from the search page and all the way to its internal networks.
By Ed Austin, talk given at Edinburgh Techmeetup in December 2009
http://techmeetup.co.uk
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
When stars align: studies in data quality, knowledge graphs, and machine lear...
PostgreSQL Scaling And Failover
1. PostgreSQL
High Availability & Scaling
John Paulett
October 26, 2009
2. Overview
Scaling Overview
– Horizontal & Vertical Options
High Availability Overview
Other Options
Suggested Architecture
Hardware Discussion
10/26/2009 2
3. What are we trying to solve?
Survive server failure?
– Support an uptime SLA (e.g. 99.9999%)?
Application scaling?
– Support additional application demand
10/26/2009 3
4. What are we trying to solve?
Survive server failure?
– Support an uptime SLA (e.g. 99.9999%)?
Application scaling?
– Support additional application demand
→ Many options, each optimized for
different constraints
10/26/2009 4
6. How To Scale
Horizontal Scaling
– “Google” approach
– Distribute load across multiple servers
– Requires appropriate application architecture
Vertical Scaling
– “Big Iron” approach
– Single, massive machine (lots of fast processors,
RAM, & hard drives)
10/26/2009 6
7. Horizontal DB Scaling
Load Balancing
– Distribute operations to multiple servers
Partitioning
– Cut up the data (horizontal) or tables (vertical)
and put them on separate servers
– aka “sharding”
10/26/2009 7
8. Basic Problem when Load
Balancing
Difficult to maintain consistent state
between servers (remember ACID),
especially when dealing with writes
4 PostgreSQL Load Balancing Methods:
– Master-Slave Replication
– Statement-Based Replication Middleware
– Asynchronous Multimaster Replication
– Synchronous Multimaster Replication
10/26/2009 8
9. Master-Slave Replication
Master handles writes, slaves handle reads
Asynchronous replication
– Possible data loss on master failure
Slony-I
– Does not automatically propagate schema changes
– Does not offer single connection point
– Requires separate solution for master failures
10/26/2009 9
10. Statement-Based Replication
Middleware
Intercept SQL queries, send writes to all
servers, reads to any server
Possible issues using random(),
CURRENT_TIMESTAMP, & sequences
pgpool-II
– Connection Pooling, Replication, Load Balancing,
Parallel Queries, Failover
10/26/2009 10
12. Synchronous Multimaster
Replication
Writes & reads on any server
Not implemented in PostgreSQL, but
application code can mimic via two-phase
commit
10/26/2009 12
13. Load Balancing Issue
Scaling writes breaks down at a certain
point
10/26/2009 13
14. Partitioning
Requires heavy application modification
Performing queries across partitions is
problematic (not possible)
PL/Proxy can help
10/26/2009 14
15. Vertical DB Scaling
“Buying a bigger box is quick(ish). Redesigning
software is not.”
● Cal Henderson, Flickr
37 Signals Basecamp upgraded to 128 GB DB
server: “don’t need to pay the complexity tax
yet”
● David Heinemeier Hansson, Ruby on Rails
10/26/2009 15
16. Sites Running on Single DB
StackOverflow
– MS SQL, 48GB RAM, RAID 1 OS, RAID 10 for data
37Signals Basecamp
– MySQL, 128GB RAM. Dell R710 or Dell 2950
10/26/2009 16
18. High Availability
Application still up even after node failure
– (Also try to prevent failure with appropriate
hardware)
PostgreSQL High Availability Options
– pg-pool
– Shared Disk Failover
– File System Replication
– Warm Standby with Point-In-Time Recovery (PITR)
Often still need heartbeat application
10/26/2009 18
19. Shared Disk Failover
Use single disk array to hold database's
data files.
– Network Attached Storage (NAS)
– Network File System (NFS)
Disk array is central point of failure
Need heartbeat to bring 2nd server online
10/26/2009 19
20. File System Replication
File system is mirrored to another
computer
DRDB
– Linux filesystem replication
Need heartbeat to bring 2nd server online
10/26/2009 20
21. Point in Time Recovery
“Log shipping”
– Write Ahead Logs sent to and replayed on standby
– Included in PostgreSQL 8.0+
– Asynchronous - Potential loss of data
Warm Standby
– Standbys' hardware very similar to primary's
– Need heartbeat to bring 2nd server online
10/26/2009 21
22. Heartbeat
“STONITH” (Shoot the Other Node In The
Head)
– Prevent multiple nodes thinking they are the
master
Linux-HA
– Creates cluster, takes nodes out when they fail
10/26/2009 22
28. Suggested Architecture
2 nice machines
Point in Time Recovery with Heartbeat
Tune PostgreSQL
Monitor & improve slow queries
Add in Ehcache as we touch code
→ Leave horizontal scaling for another day
10/26/2009 28
30. Future Architecture
Scale up application servers horizontally
as needed
Improve DB Hardware
10/26/2009 30
31. Hardware Options
PostgreSQL typically constrained by RAM
& Disk IO, not processor
64-bit, as much memory as possible
Data Array
– RAID10 with 4 drives (not RAID 5), 15k RPM
Separate OS Drive / Array
10/26/2009 31
32. Dell R710
Processor: Xeon
4x 15k HD in RAID10
24GB (3x 8GB) RAM (up to 6x 16GB)
=$6,905
10/26/2009 32
33. Other Considerations
Should have Test environment mimic
Production
– Same database setup
– Provides environment for experimentation
Can host multiple DBs on single cluster
10/26/2009 33