What is Pinterest?
An online pinboard where you “curate” and “discover” things you love and go do them in real life
Discovery - “Follow” Model
[Diagram: interest graph in which a “Follower” follows a “Followee”]
• Follower indicates interest in Followee’s content
• Following feed - Content collated from followees
“Following” Feed @Pinterest
[Diagram: a new pin is fanned out to Follower 1, Follower 2, Follower 3, ...]
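A minimal sketch of the fanout shown in the diagram, assuming a plain in-memory follow graph. The names `followers_of`, `feeds`, `follow`, `fanout_new_pin` and `following_feed` are hypothetical and used only for illustration; they are not taken from Pinterest's code.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the follow graph and the per-user feeds.
followers_of = defaultdict(set)   # followee id -> set of follower ids
feeds = defaultdict(list)         # follower id -> list of (creation_ts, pin_id)


def follow(follower, followee):
    """Record that `follower` follows `followee`."""
    followers_of[followee].add(follower)


def fanout_new_pin(author, pin_id, creation_ts):
    """Write one feed entry per follower of the author; return the write count."""
    writes = 0
    for follower in followers_of[author]:
        feeds[follower].append((creation_ts, pin_id))
        writes += 1
    return writes


def following_feed(user):
    """Return the user's feed, newest pin first."""
    return sorted(feeds[user], reverse=True)
```

One new pin costs one write per follower, which is why write volume grows much faster than pin volume.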
Challenges @scale
• 100s of millions of pins/repins per month
• High fanout - billions of writes per day (High throughput)
• Billions of requests per month (Low latency and high availability)
“Following” Feed on HBase
Row key       Column (CreationTs=100,PinId=8)   Column (CreationTs=99,PinId=6)
UserId=678    <Empty>                           <Empty>
• Pins in following feed reverse chronologically sorted
• HBase Schema - choose wide over tall
- Exploit lexicographic sorting within columns for ordering
- Atomic transactions per user (inconsistencies get noticed at scale)
- Row level bloom filters to eliminate unnecessary seeks
- Prefix compression (FAST_DIFF)
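One possible way to get reverse chronological order out of lexicographic column sorting, sketched under assumptions: the qualifier stores an inverted timestamp followed by the pin id, both as fixed-width big-endian integers. The slides do not show the actual byte encoding, so `MAX_TS`, `qualifier` and `decode` are hypothetical, and a dict stands in for one HBase row.

```python
import struct

MAX_TS = 2**63 - 1  # assumed upper bound on timestamps (not from the slides)


def qualifier(creation_ts, pin_id):
    """Encode (creation_ts, pin_id) so that byte order is newest-first."""
    return struct.pack(">QQ", MAX_TS - creation_ts, pin_id)


def decode(q):
    """Recover (creation_ts, pin_id) from an encoded qualifier."""
    inverted_ts, pin_id = struct.unpack(">QQ", q)
    return MAX_TS - inverted_ts, pin_id


# One wide row per user: the qualifiers carry the data, the values stay empty.
row = {qualifier(99, 6): b"", qualifier(100, 8): b"", qualifier(42, 3): b""}

# HBase returns the columns of a row sorted by qualifier bytes; sorted() mimics that.
feed = [decode(q) for q in sorted(row)]
```

With this encoding a plain scan of the row yields the newest pins first, with no sort at read time.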
[Architecture diagram: the Frontend enqueues async tasks (Follow, Unfollow, New Pin) on the Message Bus; Workers dequeue tasks; other components shown: Follow Store, Pin Store, Thrift + Finagle layer, HBase]
Write Path
• Follow => put
• Unfollow => delete
• New Pin => multi put
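The mapping above can be sketched as a worker-side dispatcher. This is an illustration under assumptions, not the production code: `FakeFeedTable` is a hypothetical in-memory stand-in for the HBase feed table, and the choice of which columns a follow or unfollow touches (the followee's existing pins) is assumed, since the slides only name the operation types.

```python
from collections import defaultdict


class FakeFeedTable:
    """Hypothetical in-memory stand-in: row key -> {column: value}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, row_key, columns):
        # Columns carry the data; values stay empty, as in the schema slide.
        self.rows[row_key].update({c: b"" for c in columns})

    def delete(self, row_key, columns):
        for c in columns:
            self.rows[row_key].pop(c, None)

    def multi_put(self, row_keys, columns):
        # One put per follower row; a real client would batch these.
        for row_key in row_keys:
            self.put(row_key, columns)


def handle_task(table, task, followers_of, pins_of):
    """Translate a dequeued task into the matching table operation."""
    kind = task["type"]
    if kind == "follow":
        # Assumed: backfill the followee's pins into the follower's feed.
        table.put(task["follower"], pins_of[task["followee"]])
    elif kind == "unfollow":
        # Assumed: remove the followee's pins from the follower's feed.
        table.delete(task["follower"], pins_of[task["followee"]])
    elif kind == "new_pin":
        table.multi_put(followers_of[task["author"]], [task["pin"]])
    else:
        raise ValueError(f"unknown task type: {kind}")
```

Here pins are represented as `(creation_ts, pin_id)` tuples, and `followers_of` and `pins_of` are plain dicts standing in for the Follow Store and Pin Store.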
Optimizing Writes
• Increase per region memstore size
- 512M memstore -> 40M HFile
- Fewer HFiles and hence less frequent compactions
• GC tuning
- More frequent but smaller pauses
Single Points of Failure
[Diagram: Frontend and Message Queue in front of Cluster 1 and Cluster 2; labels: dual writes, cross cluster replication, ZK Quorum, writes, reads]
Single Points of Failure (contd)
[Diagram: HDFS NN; storage labels: Ephemeral, EBS]
• No concept of HA shared storage on EC2
• Keep it simple
- HA namenode + QJM - hell, no!
- Operate two clusters each in its own AZ
Am I Better Off?
Redis vs HBase
• Sharding, load balancing and fault tolerance
• Longer feeds
• Resolve data inconsistencies
• Savings in $$
Cluster configuration
• hi1.4xlarge - SSD backed for performance parity
• HBase - 0.94.3 and 0.94.7
• HDFS - CDH 4.2.0
And many more...
• Rich pins
• Duplicate pin notifications
• Pinterest analytics
• Recommendations - “People who pinned this also pinned”
More to come...