This document discusses using PostgreSQL for a large database at Inmobi, an independent mobile ad network. It covers topics like partitioning the database into tables by date for improved performance, choosing indexes to optimize queries, ensuring high availability through streaming replication, and automating regular maintenance and archiving of old data.
3. Today’s agenda
Introduction
Good for big databases
Current statistics
Configurations
Partitioning
Indexing
High availability
Maintenance
4. Inmobi
Independent mobile ad network
Serves billions of ads, with 1 billion monthly active users and 30,000+ publishers
6. Good for Big Databases
Scales
Low query latency
High availability
Easy maintenance
7. Few Stats
Number of unique queries – 96,408
Number of select queries per second – 3,455 qps
Number of insert, update, delete queries per second – 1,217 qps
Total number of sessions – 25,431
Total number of sessions per second – 58
20. Choosing indexes ..
PREPARE query(int, int) AS SELECT sum(bar) FROM test
WHERE id > $1 AND id < $2
GROUP BY foo;
EXPLAIN ANALYZE EXECUTE query(100, 200);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
  -> Index Scan using test_pkey on test (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
       Index Cond: ((id > $1) AND (id < $2))
Total runtime: 0.851 ms
(4 rows)
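When reading a plan like the one above, it helps to compare the planner's estimated row count with the actual row count for each node (here the index scan was estimated at 1311 rows but returned 99). The following is a minimal sketch of pulling those numbers out of EXPLAIN ANALYZE text output. The helper name `row_estimates` and the regular expression are illustrative assumptions, not part of the original slides, and the pattern only covers the plan-line shape shown here.

```python
import re

# Matches one plan node line of the form:
#   [-> ]Node (cost=a..b rows=N width=W) (actual time=c..d rows=M loops=L)
NODE_RE = re.compile(
    r"^\s*(?:->\s*)?(?P<node>.+?)\s+"
    r"\(cost=[\d.]+\.\.[\d.]+ rows=(?P<est>\d+) width=\d+\)\s+"
    r"\(actual time=[\d.]+\.\.[\d.]+ rows=(?P<act>\d+) loops=\d+\)"
)

def row_estimates(plan_text):
    """Return (node, estimated_rows, actual_rows) for each plan node line."""
    results = []
    for line in plan_text.splitlines():
        match = NODE_RE.match(line)
        if match:
            results.append(
                (match.group("node"), int(match.group("est")), int(match.group("act")))
            )
    return results

# The plan from the slide, with wrapped lines joined.
PLAN = """\
HashAggregate (cost=39.53..39.53 rows=1 width=8) (actual time=0.661..0.672 rows=7 loops=1)
  -> Index Scan using test_pkey on test (cost=0.00..32.97 rows=1311 width=8) (actual time=0.050..0.395 rows=99 loops=1)
       Index Cond: ((id > $1) AND (id < $2))
Total runtime: 0.851 ms
"""

for node, est, act in row_estimates(PLAN):
    print(f"{node}: estimated {est} rows, actual {act} rows")
```

A large gap between the two numbers is usually the first thing to look at when judging whether an index is being used well.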
21. Archival and retention
Set retention policy
Jobs to automate archival
Managed by puppet
Need to pass parameters
Send to “S3 bucket”
Validate data
Detach the child table from the parent table (NO INHERIT), then drop it
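The archival steps above can be sketched as a small planning script. This is only an illustration under stated assumptions: daily child tables named `<parent>_YYYYMMDD`, a CSV export, and the table name `events`, the bucket name `example-archive`, and every function name are hypothetical. The script only builds the ordered steps and SQL text. It does not connect to a database or to S3.

```python
from datetime import date, timedelta

def child_name(parent, day):
    """Assumed naming scheme for daily partitions: parent_YYYYMMDD."""
    return f"{parent}_{day:%Y%m%d}"

def expired_days(today, retention_days, oldest):
    """Yield each day from `oldest` up to (not including) the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    day = oldest
    while day < cutoff:
        yield day
        day += timedelta(days=1)

def archive_plan(parent, day, bucket):
    """Ordered steps for one partition: export, upload, validate, detach, drop."""
    child = child_name(parent, day)
    return [
        ("export", f'COPY "{child}" TO STDOUT WITH (FORMAT csv);'),
        ("upload", f"s3://{bucket}/{parent}/{child}.csv"),
        # Validation here means comparing this count with the exported row count.
        ("validate", f'SELECT count(*) FROM "{child}";'),
        ("detach", f'ALTER TABLE "{child}" NO INHERIT "{parent}";'),
        ("drop", f'DROP TABLE "{child}";'),
    ]

# Parameters a scheduled job might receive (all values are made up).
for day in expired_days(date(2015, 1, 10), retention_days=7, oldest=date(2015, 1, 1)):
    for step, detail in archive_plan("events", day, "example-archive"):
        print(step, detail)
```

Keeping the detach and drop steps last means a partition is only removed after its export has been uploaded and validated.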
22. High availability
2 Servers
Streaming replication
Reporting packages (get synced by puppet)
pgbouncer
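With streaming replication, a common health check is how far the standby is behind the primary. PostgreSQL reports write-ahead log positions as text of the form `high/low`, two hexadecimal numbers, so the lag in bytes is the difference of the two positions. The sketch below shows only that arithmetic. The function names are illustrative, and how the two positions are read from each server depends on the PostgreSQL version and is not shown.

```python
def lsn_to_int(lsn):
    """Convert a log position such as '16/B374D848' to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)

def replication_lag_bytes(primary_lsn, standby_lsn):
    """Bytes of write-ahead log the standby has not yet reached."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)

# Example positions (made up): the standby is 0x60 = 96 bytes behind.
print(replication_lag_bytes("0/3000060", "0/3000000"))
```

A monitoring check could alert when this value stays above a chosen threshold. The right threshold depends on the write rate and is left as an assumption.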
23. Database fitness
Vacuum analyze once a day
Check bloat % and vacuum
Offline: vacuum full and reindexing
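The "check bloat % and vacuum" step can be approximated from the live and dead tuple counters that PostgreSQL exposes per table in `pg_stat_user_tables` (`n_live_tup`, `n_dead_tup`). The sketch below is an assumption about how such a check might look, not the procedure from the talk: the dead-tuple share is only a rough proxy for bloat, and the 20% threshold and the table names are hypothetical.

```python
def dead_tuple_pct(n_live_tup, n_dead_tup):
    """Share of dead tuples in a table, as a percentage (0.0 for an empty table)."""
    total = n_live_tup + n_dead_tup
    return 0.0 if total == 0 else 100.0 * n_dead_tup / total

def tables_to_vacuum(stats, threshold_pct=20.0):
    """Pick tables whose dead-tuple share is at or above the threshold.

    `stats` is a list of (table_name, n_live_tup, n_dead_tup) rows, for
    example as fetched from pg_stat_user_tables.
    """
    return [
        name
        for name, live, dead in stats
        if dead_tuple_pct(live, dead) >= threshold_pct
    ]

# Made-up counters for three tables.
sample = [("ads", 80, 20), ("clicks", 99, 1), ("empty_table", 0, 0)]
for table in tables_to_vacuum(sample):
    print(f'VACUUM ANALYZE "{table}";')
```

Tables flagged this way can be vacuumed online. The heavier operations, vacuum full and reindexing, are reserved for the offline window.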
24. What else …
Monitoring and alerting - Nagios
Graphing - Graphite
Stats - pgBadger