Scaling Pinterest




                      Marty Weiner             Yashh Nelapati
                      Krypton                  Gotham City




Friday, July 27, 12
Pinterest is . . .
               An online pinboard to organize and
                    share what inspires you.



     Scaling Pinterest

Relationships




                         Marty Weiner                  Yashh Nelapati
                         Grayskull, Eternia            Gotham City

Page Views / Day

               [Chart: page views per day, Mar 2010 through May 2012]

                            ·   RackSpace
                            ·   1 small Web Engine
                            ·   1 small MySQL DB
                            ·   1 Engineer

Page Views / Day

                            ·   Amazon EC2 + S3 + CloudFront
                            ·   1 NGinX, 4 Web Engines
                            ·   1 MySQL DB + 1 Read Slave
                            ·   1 Task Queue + 2 Task Processors
                            ·   1 MongoDB
                            ·   2 Engineers

Page Views / Day

                      ·    Amazon EC2 + S3 + CloudFront
                      ·    2 NGinX, 16 Web Engines + 2 API Engines
                      ·    5 Functionally Sharded MySQL DB + 9 read slaves
                      ·    4 Cassandra Nodes
                      ·    15 Membase Nodes (3 separate clusters)
                      ·    8 Memcache Nodes
                      ·    10 Redis Nodes
                      ·    3 Task Routers + 4 Task Processors
                      ·    4 Elastic Search Nodes
                      ·    3 Mongo Clusters
                      ·    3 Engineers

Lesson Learned #1
                         It will fail. Keep it simple.




Page Views / Day

                            ·   Amazon EC2 + S3 + Akamai, ELB
                            ·   90 Web Engines + 50 API Engines
                            ·   66 MySQL DBs (m1.xlarge) + 1 slave each
                            ·   59 Redis Instances
                            ·   51 Memcache Instances
                            ·   1 Redis Task Manager + 25 Task Processors
                            ·   Sharded Solr
                            ·   6 Engineers

Page Views / Day

                            ·   Amazon EC2 + S3 + Edge Cast, ELB
                            ·   135 Web Engines + 75 API Engines
                            ·   80 MySQL DBs (m1.xlarge) + 1 slave each
                            ·   110 Redis Instances
                            ·   60 Memcache Instances
                            ·   2 Redis Task Managers + 60 Task Processors
                            ·   Sharded Solr
                            ·   25 Engineers

Why Amazon EC2/S3?

                      · Very good reliability, reporting, and support
                      · Very good peripherals, such as managed cache,
                         DB, load balancing, DNS, map reduce, and more...
                      · New instances ready in seconds

                      · Con: Limited choice
                      · Pro: Limited choice

Why MySQL?
                      ·   Extremely mature
                      ·   Well known and well liked
                      ·   Rarely catastrophic loss of data
                      ·   Response time increases linearly with request rate
                      ·   Very good software support - XtraBackup, Innotop, Maatkit
                      ·   Solid active community
                      ·   Very good support from Percona
                      ·   Free



Why Memcache?

                      ·   Extremely mature
                      ·   Very good performance
                      ·   Well known and well liked
                      ·   Never crashes, and few failure modes
                      ·   Free




Why Redis?
                      ·   Variety of convenient data structures
                      ·   Has persistence and replication
                      ·   Well known and well liked
                      ·   Consistently good performance
                      ·   Few failure modes
                      ·   Free



Clustering
                             vs
                          Sharding


                      Clustering
                                 ·   Data distributed automatically
                                 ·   Data can move
                                 ·   Rebalances to distribute capacity
                                 ·   Nodes communicate with each other


                      Sharding
                                 ·   Data distributed manually
                                 ·   Data does not move
                                 ·   Split data to distribute load
                                 ·   Nodes are not aware of each other

Why Clustering?
                      ·   Examples: Cassandra, MemBase, HBase, Riak
                      ·   Automatically scale your datastore
                      ·   Easy to set up
                      ·   Spatially distribute and colocate your data
                      ·   High availability
                      ·   Load balancing
                      ·   No single point of failure


What could possibly go wrong?




                                 source: thereifixedit.com




Why Not Clustering?
                      ·   Still fairly young
                      ·   Fundamentally complicated
                      ·   Less community support
                      ·   Fewer engineers with working knowledge
                      ·   Difficult and scary upgrade mechanisms
                      ·   And, yes, there is a single point of failure. A BIG one.



Clustering Single Point of Failure


                                  Cluster
                                 Management
                                 Algorithm

Cluster Manager

                      · Same complex code replicated over all nodes
                      · Failure modes:
                        · Data rebalance breaks
                        · Data corruption across all nodes
                        · Improper balancing that cannot be fixed (easily)
                        · Data authority failure


Lesson Learned #2
                          Clustering is scary.




Why Sharding?
                      ·   Can split your databases to add more capacity
                      ·   Spatially distribute and colocate your data
                      ·   High availability
                      ·   Load balancing
                      ·   Algorithm for placing data is very simple
                      ·   ID generation is simple



When to shard?
                      · Sharding makes schema design harder

                      ·   Solidify site design and backend architecture
                      ·   Remove all joins and complex queries, add cache
                      ·   Functionally shard as much as possible
                      ·   Still growing? Shard.



Our Transition
                               1 DB + Foreign Keys + Joins

                              1 DB + Denormalized + Cache

                               1 DB + Read slaves + Cache

             Several functionally sharded DBs + Read slaves + Cache

                         ID sharded DBs + Backup slaves + Cache



Watch out for...
                      ·   Cannot perform most JOINS
                      ·   No transaction capabilities
                      ·   Extra effort to maintain unique constraints
                      ·   Schema changes require more planning
                      ·   A single report requires running the same query on
                          all shards

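The last point above (one report means the same query on every shard) can be sketched as a simple scatter-gather loop. This is only an illustration: in-memory SQLite databases stand in for the MySQL shards, and `make_shard` and `count_users` are hypothetical helper names, not part of the talk.

```python
import sqlite3


def make_shard(user_names):
    """Create an in-memory stand-in for one shard with a users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [(name,) for name in user_names])
    return conn


def count_users(shards):
    """Run the same query on every shard and add up the partial results."""
    total = 0
    for conn in shards:
        (n,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
        total += n
    return total


# Three toy shards holding 2, 1, and 0 users.
shards = [make_shard(["ann", "bob"]), make_shard(["cy"]), make_shard([])]
print(count_users(shards))
```

Aggregates such as counts and sums combine easily this way; anything that needs a global sort or a cross-shard join takes more work, which is the cost the slide warns about.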
How we sharded




Sharded Server Topology




                         db00001       db00513        db03073        db03585
                         db00002       db00514        db03074        db03586
                           .......       .......        .......        .......
                         db00512       db01024        db03584        db04096



                         Initially, 8 physical servers, each with 512 DBs

High Availability




                         db00001          db00513      db03073     db03585
                         db00002          db00514      db03074     db03586
                           .......          .......      .......     .......
                         db00512          db01024      db03584     db04096



                                      Multi Master replication

Increased load on DB?


                               db00001                     db00001
                               db00002                     db00002
                                 .......                     .......
                               db00512                     db00256

                                                           db00257
                                                           db00258
                                                             .......
                                                           db00512

                      To increase capacity, a server is replicated and the
                      new replica becomes responsible for some DBs

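The capacity step described above amounts to splitting one entry of a server-to-range map in two. A minimal sketch, assuming a plain dict from server name to an inclusive (low, high) range of DB numbers; `split_server` and the server names are hypothetical, chosen only for illustration:

```python
def split_server(lookup, old_server, new_server):
    """Hand the upper half of old_server's DB range to new_server.

    `lookup` maps server name -> inclusive (low, high) range of DB numbers.
    Returns a new dict; the input is left untouched.
    """
    low, high = lookup[old_server]
    if high <= low:
        raise ValueError("range too small to split")
    mid = (low + high) // 2  # last DB that stays on the old server
    updated = dict(lookup)
    updated[old_server] = (low, mid)
    updated[new_server] = (mid + 1, high)
    return updated


# Hypothetical names: the replica of "server_a" takes over DBs 257..512.
before = {"server_a": (1, 512)}
after = split_server(before, "server_a", "server_a_replica")
print(after)
```

Because the replica starts as a full copy, only the lookup entry has to change when it takes over its half of the range.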
ID Structure
                                            64 bits


                            Shard ID    Type          Local ID

              · A lookup data structure maps each physical server to a
                      shard ID range (cached by each app server process)
              · Shard ID denotes which shard
              · Type denotes object type (e.g., pins)
              · Local ID denotes position in table

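One way to picture the 64-bit layout is as three bit fields packed into a single integer. The sketch below assumes 16 bits for the shard ID (consistent with the 65536-shard ID space mentioned later in the talk); the 10-bit type and 36-bit local ID widths are assumptions, since the slides do not give them, and `make_id` / `split_id` are hypothetical names.

```python
SHARD_BITS = 16   # 2**16 = 65536 shards, matching the ID space in the talk
TYPE_BITS = 10    # assumption: width not given on the slides
LOCAL_BITS = 36   # assumption: width not given on the slides

TYPE_SHIFT = LOCAL_BITS
SHARD_SHIFT = LOCAL_BITS + TYPE_BITS


def make_id(shard_id, type_id, local_id):
    """Pack (shard, type, local) into one integer that fits in 64 bits."""
    if not 0 <= shard_id < (1 << SHARD_BITS):
        raise ValueError("shard_id out of range")
    if not 0 <= type_id < (1 << TYPE_BITS):
        raise ValueError("type_id out of range")
    if not 0 <= local_id < (1 << LOCAL_BITS):
        raise ValueError("local_id out of range")
    return (shard_id << SHARD_SHIFT) | (type_id << TYPE_SHIFT) | local_id


def split_id(full_id):
    """Recover (shard_id, type_id, local_id) from a packed ID."""
    shard_id = (full_id >> SHARD_SHIFT) & ((1 << SHARD_BITS) - 1)
    type_id = (full_id >> TYPE_SHIFT) & ((1 << TYPE_BITS) - 1)
    local_id = full_id & ((1 << LOCAL_BITS) - 1)
    return shard_id, type_id, local_id


# Example values are made up: shard 1025, object type 1, local row 7233494.
pin_id = make_id(shard_id=1025, type_id=1, local_id=7233494)
print(split_id(pin_id))
```

Since the shard ID is embedded in every object ID, any app server can route a request from the ID alone, with no call to a separate ID service.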
Why not an ID service?


              · It is a single point of failure
              · Extra look up to compute a UUID




Lookup Structure
                                       {"sharddb001a": (   1,  512),
                                        "sharddb002b": ( 513, 1024),
                                        "sharddb003a": (1025, 1536),
                                         ...
                                        "sharddb008b": (3585, 4096)}



                         sharddb003a          DB01025              users

                                                users              1   ser-data
                                                user_has_boards    2   ser-data
                                                boards             3   ser-data

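Resolving a shard ID against a structure like the one above could look roughly like this. It is a sketch under assumptions: only the four entries visible on the slide are included (the elided ones stay elided), and `server_for_shard` and `database_name` are hypothetical helpers.

```python
import bisect

# Server name -> inclusive shard ID range. Only the entries shown on the
# slide are listed; the elided ones ("...") are left out here.
SHARD_RANGES = {
    "sharddb001a": (1, 512),
    "sharddb002b": (513, 1024),
    "sharddb003a": (1025, 1536),
    "sharddb008b": (3585, 4096),
}

# Sort once by range start so each lookup is a binary search.
_SORTED = sorted((low, high, name) for name, (low, high) in SHARD_RANGES.items())
_STARTS = [low for low, _, _ in _SORTED]


def server_for_shard(shard_id):
    """Return the server that owns shard_id, or raise KeyError."""
    i = bisect.bisect_right(_STARTS, shard_id) - 1
    if i >= 0:
        low, high, name = _SORTED[i]
        if low <= shard_id <= high:
            return name
    raise KeyError("no server configured for shard %d" % shard_id)


def database_name(shard_id):
    """Format the per-shard database name, e.g. shard 1025 -> 'db01025'."""
    return "db%05d" % shard_id


print(server_for_shard(1025), database_name(1025))
```

The map is small enough to keep in memory in every app server process, so the lookup adds no network round trip.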
ID Structure

              ·       New users are randomly distributed across shards
              ·       Boards, pins, etc. try to be colocated with user
              ·       Local IDs are assigned by auto-increment
              ·       Enough ID space for 65536 shards, but only the first
                      4096 opened initially. Can expand horizontally.

Objects and Mappings
                      · Object tables (e.g., pin, board, user, comment)
                        · Local ID → MySQL blob (JSON / serialized Thrift)
                      · Mapping tables (e.g., user has boards, pin has likes)
                        · Full ID → Full ID (+ timestamp)
                        · Naming scheme is noun_verb_noun
                      · Queries are PK or index lookups (no joins)
                      · Data DOES NOT MOVE
                      · All tables exist on all shards
                      · No schema changes required (index = new table)

Loading a Page
       · Rendering user profile
              SELECT body FROM users WHERE id=<local_user_id>
              SELECT board_id FROM user_has_boards WHERE user_id=<user_id>
              SELECT body FROM boards WHERE id IN (<board_ids>)
              SELECT pin_id FROM board_has_pins WHERE board_id=<board_id>
              SELECT body FROM pins WHERE id IN (<pin_ids>)

       · Most of these calls will be a cache hit
       · Offsets/limits and the mapping sequence ID sort are omitted here

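The query sequence above can be tried end to end with a small stand-in. This sketch uses an in-memory SQLite database in place of one MySQL shard, JSON text as the serialized body, and small integers instead of full 64-bit IDs; the table layout and `load_profile` are assumptions made for illustration, and caching is left out.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for one MySQL shard
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE boards (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE user_has_boards (user_id INTEGER, board_id INTEGER);
    CREATE INDEX uhb_user ON user_has_boards (user_id);
""")
conn.execute("INSERT INTO users VALUES (1, ?)", (json.dumps({"name": "ann"}),))
conn.executemany("INSERT INTO boards VALUES (?, ?)",
                 [(10, json.dumps({"title": "Recipes"})),
                  (11, json.dumps({"title": "Travel"}))])
conn.executemany("INSERT INTO user_has_boards VALUES (?, ?)", [(1, 10), (1, 11)])


def load_profile(conn, user_id):
    """Fetch a user and their boards with key lookups only, no joins."""
    (user_body,) = conn.execute(
        "SELECT body FROM users WHERE id = ?", (user_id,)).fetchone()
    board_ids = [row[0] for row in conn.execute(
        "SELECT board_id FROM user_has_boards WHERE user_id = ? "
        "ORDER BY board_id", (user_id,))]
    boards = []
    if board_ids:
        # One "?" placeholder per board ID for the IN (...) lookup.
        marks = ", ".join("?" * len(board_ids))
        rows = conn.execute(
            "SELECT body FROM boards WHERE id IN (%s) ORDER BY id" % marks,
            board_ids)
        boards = [json.loads(row[0]) for row in rows]
    return {"user": json.loads(user_body), "boards": boards}


profile = load_profile(conn, 1)
print(profile)
```

Each step is a primary-key or index lookup, so every step could be answered from a cache keyed by the same IDs.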
Scripting
              · Must get old data into your shiny new shard
              · 500M pins, 1.6B follower rows, etc.
              · Build a scripting farm
                · Spawn more workers and complete the task faster
              · Pyres - based on GitHub’s Resque queue

Caching
                      · Redis lists to cache mappings
                        · lpush user:82363:pins 7233494
                        · lrange user:82363:pins 0 49
                      · Use Memcache to cache objects
                      · Shard caches based upon pools
                        · Better stats
                        · Easier to scale and manage

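The two Redis commands above implement a "newest first" list per user. The sketch below mimics that pattern without a Redis server: `FakeRedis` is a hypothetical in-memory stand-in that covers only `lpush` and `lrange` with non-negative indexes, and `add_pin` / `latest_pins` are illustrative helper names.

```python
class FakeRedis:
    """Tiny in-memory stand-in for the two Redis list commands used here."""

    def __init__(self):
        self._lists = {}

    def lpush(self, key, value):
        # Like Redis LPUSH: insert at the head, return the new length.
        items = self._lists.setdefault(key, [])
        items.insert(0, value)
        return len(items)

    def lrange(self, key, start, stop):
        # Like Redis LRANGE with non-negative indexes: stop is inclusive.
        return self._lists.get(key, [])[start:stop + 1]


def add_pin(cache, user_id, pin_id):
    """Record a new pin at the front of the user's cached pin list."""
    cache.lpush("user:%d:pins" % user_id, pin_id)


def latest_pins(cache, user_id, count=50):
    """Return up to `count` most recent pin IDs for the user."""
    return cache.lrange("user:%d:pins" % user_id, 0, count - 1)


cache = FakeRedis()
for pin_id in (7233492, 7233493, 7233494):
    add_pin(cache, 82363, pin_id)
print(latest_pins(cache, 82363))  # newest pin first
```

Pushing at the head keeps the list in reverse chronological order, so the first page of a user's pins is always the range 0 to 49.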
Caching with decorators



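                      def user_get_many(self, user_ids):
                            # Uncached version: every call goes to MySQL.
                            cursor = get_mysql_conn().cursor()
                            # One %s placeholder per ID; the driver escapes each value.
                            placeholders = ", ".join(["%s"] * len(user_ids))
                            cursor.execute("SELECT * FROM users WHERE id IN (%s)" % placeholders, user_ids)
                            return cursor.fetchall()
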
Caching with decorators
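                      @mc_objects(USER_MC_CONN,
                          format="user:%d",
                          version=1,
                          serialization=simplejson,
                          expire_time=0)
                      def user_get_many(self, user_ids):
                            # Only reached for IDs the decorator did not find in Memcache.
                            cursor = get_mysql_conn().cursor()
                            # One %s placeholder per ID; the driver escapes each value.
                            placeholders = ", ".join(["%s"] * len(user_ids))
                            cursor.execute("SELECT * FROM users WHERE id IN (%s)" % placeholders, user_ids)
                            return cursor.fetchall()
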
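The slides do not show how `mc_objects` is implemented. As a rough idea of what such a decorator might do, here is a simplified sketch: a dict stands in for the Memcache connection, JSON is the serialization, expiry is omitted, and the wrapped function is assumed to return a dict mapping each ID to its object. All of these are assumptions for illustration, not the talk's actual code.

```python
import functools
import json


def mc_objects(cache, format, version=1):
    """Cache each object returned by a get-many function under its own key.

    Assumes the wrapped function takes a list of IDs and returns a dict
    mapping each found ID to a JSON-serializable object.
    """
    def decorator(fetch_many):
        @functools.wraps(fetch_many)
        def wrapper(ids):
            keys = {i: "v%d:%s" % (version, format % i) for i in ids}
            found = {i: json.loads(cache[k]) for i, k in keys.items() if k in cache}
            missing = [i for i in ids if i not in found]
            if missing:
                # Only the cache misses reach the database.
                for i, obj in fetch_many(missing).items():
                    cache[keys[i]] = json.dumps(obj)
                    found[i] = obj
            return found
        return wrapper
    return decorator


USER_CACHE = {}   # a dict standing in for a Memcache client
DB_CALLS = []     # records which IDs actually hit the "database"


@mc_objects(USER_CACHE, format="user:%d")
def user_get_many(user_ids):
    DB_CALLS.append(list(user_ids))
    return {i: {"id": i, "name": "user%d" % i} for i in user_ids}


user_get_many([1, 2])   # both miss, so both are fetched
user_get_many([2, 3])   # 2 is cached, only 3 is fetched
print(DB_CALLS)
```

Putting the version in the key means a change to the serialized format can be rolled out by bumping the version, which leaves old entries unused instead of misread.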
Caching with decorators



                      def get_board_pins(self, board_id, offset, limit):
                             # Uncached version: every page request goes to MySQL.
                             cursor = get_mysql_conn().cursor()
                             # MySQL expects LIMIT before OFFSET; values are passed as parameters.
                             cursor.execute("SELECT pin_id FROM bp WHERE id = %s LIMIT %s OFFSET %s",
                                            (board_id, int(limit), int(offset)))
                             return cursor.fetchall()

Caching with decorators

                      @paged_list(BOARD_REDIS_CONN,
                          format="b:p:%d",
                          version=1,
                          expire_time=24*60*60)
                      def get_board_pins(self, board_id, offset, limit):
                             # Only reached when the page is not already cached in Redis.
                             cursor = get_mysql_conn().cursor()
                             # MySQL expects LIMIT before OFFSET; values are passed as parameters.
                             cursor.execute("SELECT pin_id FROM bp WHERE id = %s LIMIT %s OFFSET %s",
                                            (board_id, int(limit), int(offset)))
                             return cursor.fetchall()

Current problems
              · Service Based Architecture
                · Connection limits
                · Isolation of functionality
                · Isolation of access (security)
              · Scaling the Team



Lesson Learned #3
                             Keep it fun.




We are Hiring!
                          jobs@pinterest.com




Questions?

                         marty@pinterest.com   yashh@pinterest.com




Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 
12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work12 Ways to Increase Your Influence at Work
12 Ways to Increase Your Influence at Work
 
ChatGPT webinar slides
ChatGPT webinar slidesChatGPT webinar slides
ChatGPT webinar slides
 
More than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike RoutesMore than Just Lines on a Map: Best Practices for U.S Bike Routes
More than Just Lines on a Map: Best Practices for U.S Bike Routes
 
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
Ride the Storm: Navigating Through Unstable Periods / Katerina Rudko (Belka G...
 
Barbie - Brand Strategy Presentation
Barbie - Brand Strategy PresentationBarbie - Brand Strategy Presentation
Barbie - Brand Strategy Presentation
 
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them wellGood Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
Good Stuff Happens in 1:1 Meetings: Why you need them and how to do them well
 
Introduction to C Programming Language
Introduction to C Programming LanguageIntroduction to C Programming Language
Introduction to C Programming Language
 
The Pixar Way: 37 Quotes on Developing and Maintaining a Creative Company (fr...
The Pixar Way: 37 Quotes on Developing and Maintaining a Creative Company (fr...The Pixar Way: 37 Quotes on Developing and Maintaining a Creative Company (fr...
The Pixar Way: 37 Quotes on Developing and Maintaining a Creative Company (fr...
 
9 Tips for a Work-free Vacation
9 Tips for a Work-free Vacation9 Tips for a Work-free Vacation
9 Tips for a Work-free Vacation
 

MySQL Meetup July_2012-scaling_pinterest

  • 1. Scaling Marty Weiner Yashh Nelapati Krypton Gotham City Friday, July 27, 12
  • 2. Pinterest is . . . An online pinboard to organize and share what inspires you. Scaling Pinterest Friday, July 27, 12
  • 6. Relationships: Marty Weiner (Grayskull, Eternia)
  • 7. Relationships: Marty Weiner (Grayskull, Eternia); Yashh Nelapati (Gotham City)
  • 8. Page Views / Day (chart; time axis from Mar 2010 to May 2012)
  • 9. Page Views / Day
      · Rackspace
      · 1 small Web Engine
      · 1 small MySQL DB
      · 1 Engineer
  • 10. Page Views / Day (chart; time axis from Mar 2010 to May 2012)
  • 11. Page Views / Day
      · Amazon EC2 + S3 + CloudFront
      · 1 nginx, 4 Web Engines
      · 1 MySQL DB + 1 Read Slave
      · 1 Task Queue + 2 Task Processors
      · 1 MongoDB
      · 2 Engineers
  • 12. Page Views / Day (chart; time axis from Mar 2010 to May 2012)
  • 13. Page Views / Day
      · Amazon EC2 + S3 + CloudFront
      · 2 nginx, 16 Web Engines + 2 API Engines
      · 5 Functionally Sharded MySQL DBs + 9 read slaves
      · 4 Cassandra Nodes
      · 15 Membase Nodes (3 separate clusters)
      · 8 Memcache Nodes
      · 10 Redis Nodes
      · 3 Task Routers + 4 Task Processors
      · 4 Elasticsearch Nodes
      · 3 Mongo Clusters
      · 3 Engineers
  • 14. Lesson Learned #1: It will fail. Keep it simple.
  • 15. Page Views / Day (chart; time axis from Mar 2010 to May 2012)
  • 16. Page Views / Day
      · Amazon EC2 + S3 + Akamai, ELB
      · 90 Web Engines + 50 API Engines
      · 66 MySQL DBs (m1.xlarge) + 1 slave each
      · 59 Redis Instances
      · 51 Memcache Instances
      · 1 Redis Task Manager + 25 Task Processors
      · Sharded Solr
      · 6 Engineers
  • 17. Page Views / Day (chart; time axis from Mar 2010 to May 2012)
  • 18. Page Views / Day
      · Amazon EC2 + S3 + EdgeCast, ELB
      · 135 Web Engines + 75 API Engines
      · 80 MySQL DBs (m1.xlarge) + 1 slave each
      · 110 Redis Instances
      · 60 Memcache Instances
      · 2 Redis Task Managers + 60 Task Processors
      · Sharded Solr
      · 25 Engineers
  • 19-21. Why Amazon EC2/S3?
      · Very good reliability, reporting, and support
      · Very good peripherals, such as managed cache, DB, load balancing, DNS, map reduce, and more...
      · New instances ready in seconds
      · Con: Limited choice
      · Pro: Limited choice
  • 22. Why MySQL?
      · Extremely mature
      · Well known and well liked
      · Rarely catastrophic loss of data
      · Response time to request rate increases linearly
      · Very good software support: XtraBackup, Innotop, Maatkit
      · Solid active community
      · Very good support from Percona
      · Free
  • 23. Why Memcache?
      · Extremely mature
      · Very good performance
      · Well known and well liked
      · Never crashes, and few failure modes
      · Free
  • 24. Why Redis?
      · Variety of convenient data structures
      · Has persistence and replication
      · Well known and well liked
      · Consistently good performance
      · Few failure modes
      · Free
  • 25. Clustering vs Sharding
  • 26. Clustering vs Sharding: the clustering end
      · Data distributed automatically
      · Data can move
      · Rebalances to distribute capacity
      · Nodes communicate with each other
  • 27. Clustering vs Sharding: the sharding end
      · Data distributed manually
      · Data does not move
      · Split data to distribute load
      · Nodes are not aware of each other
  • 28. Why Clustering?
      · Examples: Cassandra, Membase, HBase, Riak
      · Automatically scale your datastore
      · Easy to set up
      · Spatially distribute and colocate your data
      · High availability
      · Load balancing
      · No single point of failure
  • 29. What could possibly go wrong? (image source: thereifixedit.com)
  • 30. Why Not Clustering?
      · Still fairly young
      · Fundamentally complicated
      · Less community support
      · Fewer engineers with working knowledge
      · Difficult and scary upgrade mechanisms
      · And, yes, there is a single point of failure. A BIG one.
  • 31-34. Clustering Single Point of Failure
  • 35. Clustering Single Point of Failure: Cluster Management Algorithm
  • 36. Cluster Manager
      · Same complex code replicated over all nodes
      · Failure modes:
          · Data rebalance breaks
          · Data corruption across all nodes
          · Improper balancing that cannot be fixed (easily)
          · Data authority failure
  • 37. Lesson Learned #2: Clustering is scary.
  • 38. Why Sharding?
      · Can split your databases to add more capacity
      · Spatially distribute and colocate your data
      · High availability
      · Load balancing
      · Algorithm for placing data is very simple
      · ID generation is simplistic
  • 39. When to shard?
      · Sharding makes schema design harder
      · Solidify site design and backend architecture
      · Remove all joins and complex queries, add cache
      · Functionally shard as much as possible
      · Still growing? Shard.
  • 40. Our Transition
      · 1 DB + Foreign Keys + Joins
      · 1 DB + Denormalized + Cache
      · 1 DB + Read slaves + Cache
      · Several functionally sharded DBs + Read slaves + Cache
      · ID sharded DBs + Backup slaves + Cache
  • 41. Watch out for...
      · Cannot perform most JOINs
      · No transaction capabilities
      · Extra effort to maintain unique constraints
      · Schema changes require more planning
      · Single report requires running same query on all shards
  • 42. How we sharded
  • 43. Sharded Server Topology
      db00001   db00513   db03072   db03584
      db00002   db00514   db03073   db03585
      .......   .......   .......   .......
      db00512   db01024   db03583   db04096
      Initially, 8 physical servers, each with 512 DBs
  • 44. High Availability
      Same server topology as above, with Multi Master replication
  • 45. Increased load on DB?
      Before: one server holding db00001, db00002, ......., db00512
      After: the original server holding db00001, db00002, ......., db00256 and its replica holding db00257, db00258, ......., db00512
      To increase capacity, a server is replicated and the new replica becomes responsible for some DBs
  • 46. ID Structure
      64 bits: Shard ID | Type | Local ID
      · A lookup data structure maps each physical server to a shard ID range (cached by each app server process)
      · Shard ID denotes which shard
      · Type denotes object type (e.g., pins)
      · Local ID denotes position in table
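
The 64-bit layout on the ID Structure slide can be sketched as plain bit packing. The slide does not give field widths, so the widths below are assumptions: 16 bits for the shard ID (which matches the deck's later remark about room for 65536 shards), and an illustrative 10 bits for the type and 36 bits for the local ID. The function names `make_id` and `split_id` are hypothetical.

```python
# Assumed field widths; only the 64-bit total and the field order come from the slide.
SHARD_BITS = 16   # 2**16 = 65536 shards
TYPE_BITS = 10    # illustrative assumption
LOCAL_BITS = 36   # illustrative assumption


def _check(name, value, bits):
    """Reject values that would overflow into a neighbouring field."""
    if not 0 <= value < (1 << bits):
        raise ValueError("%s=%d does not fit in %d bits" % (name, value, bits))


def make_id(shard_id, type_id, local_id):
    """Pack shard ID, object type, and local ID into one integer."""
    _check("shard_id", shard_id, SHARD_BITS)
    _check("type_id", type_id, TYPE_BITS)
    _check("local_id", local_id, LOCAL_BITS)
    return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id


def split_id(full_id):
    """Recover (shard_id, type_id, local_id) from a packed ID."""
    local_id = full_id & ((1 << LOCAL_BITS) - 1)
    type_id = (full_id >> LOCAL_BITS) & ((1 << TYPE_BITS) - 1)
    shard_id = (full_id >> (TYPE_BITS + LOCAL_BITS)) & ((1 << SHARD_BITS) - 1)
    return shard_id, type_id, local_id
```

Because the shard ID travels inside every object ID, an app server can route a read without asking any central service, which is the point made on the "Why not an ID service?" slide.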
  • 47. Why not an ID service?
      · It is a single point of failure
      · Extra look up to compute a UUID
  • 48. Lookup Structure
      {"sharddb001a": (   1,  512),
       "sharddb002b": ( 513, 1024),
       "sharddb003a": (1025, 1536),
       ...
       "sharddb008b": (3585, 4096)}
      Example: server sharddb003a holds DB01025, which contains the tables users, user_has_boards, and boards; the users table holds rows (1, ser-data), (2, ser-data), (3, ser-data)
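
A minimal sketch of how an app server process might resolve a shard ID against the lookup structure above. Only the four entries visible on the slide are included; the elided entries are left out rather than guessed, so shard IDs in that gap raise an error here. The helper names `build_index` and `host_for_shard` are hypothetical, and the `db%05d` naming is an assumption based on the names shown in the topology slide.

```python
import bisect

# Entries copied from the slide; the ones hidden behind "..." are omitted on purpose.
SHARD_RANGES = {
    "sharddb001a": (1, 512),
    "sharddb002b": (513, 1024),
    "sharddb003a": (1025, 1536),
    "sharddb008b": (3585, 4096),
}


def build_index(ranges):
    """Sort ranges by first shard ID so each lookup is a binary search."""
    entries = sorted((low, high, host) for host, (low, high) in ranges.items())
    starts = [low for low, _, _ in entries]
    return starts, entries


def host_for_shard(index, shard_id):
    """Return (host, database name) for a shard ID, or raise KeyError."""
    starts, entries = index
    pos = bisect.bisect_right(starts, shard_id) - 1
    if pos >= 0:
        low, high, host = entries[pos]
        if low <= shard_id <= high:
            return host, "db%05d" % shard_id  # assumed naming convention
    raise KeyError("no server configured for shard %d" % shard_id)


INDEX = build_index(SHARD_RANGES)
```

Under this sketch, the capacity split described on the "Increased load on DB?" slide would amount to replacing one entry with two narrower ranges and rebuilding the index; no data has to move between shards.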
  • 49. ID Structure
      · New users are randomly distributed across shards
      · Boards, pins, etc. try to be colocated with user
      · Local IDs are assigned by auto-increment
      · Enough ID space for 65536 shards, but only first 4096 opened initially. Can expand horizontally.
  • 50. Objects and Mappings
      · Object tables (e.g., pin, board, user, comment)
          · Local ID → MySQL blob (JSON / Serialized thrift)
      · Mapping tables (e.g., user has boards, pin has likes)
          · Full ID → Full ID (+ timestamp)
          · Naming schema is noun_verb_noun
      · Queries are PK or index lookups (no joins)
      · Data DOES NOT MOVE
      · All tables exist on all shards
      · No schema changes required (index = new table)
  • 51. Loading a Page
      · Rendering user profile
          SELECT body FROM users WHERE id=<local_user_id>
          SELECT board_id FROM user_has_boards WHERE user_id=<user_id>
          SELECT body FROM boards WHERE id IN (<board_ids>)
          SELECT pin_id FROM board_has_pins WHERE board_id=<board_id>
          SELECT body FROM pins WHERE id IN (<pin_ids>)
      · Most of these calls will be a cache hit
      · Omitting offset/limits and mapping sequence id sort
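
The profile-rendering sequence above can be tried end to end with a small sketch. Here an in-memory `sqlite3` database stands in for a single MySQL shard, object bodies are JSON blobs, and all sample rows, column types, and helper names (`make_shard`, `get_many`, `load_profile`) are assumptions for illustration. Caching, offsets, limits, and the cross-shard case are left out.

```python
import json
import sqlite3


def make_shard():
    """Create the object and mapping tables on one stand-in shard."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE boards (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE pins (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE user_has_boards (user_id INTEGER, board_id INTEGER);
        CREATE TABLE board_has_pins (board_id INTEGER, pin_id INTEGER);
    """)
    return db


def get_many(db, table, ids):
    """Primary-key lookup of several objects; returns {id: decoded body}."""
    if not ids:
        return {}
    marks = ",".join("?" * len(ids))
    rows = db.execute(
        "SELECT id, body FROM %s WHERE id IN (%s)" % (table, marks), list(ids))
    return {row_id: json.loads(body) for row_id, body in rows}


def load_profile(db, user_id):
    """Follow the slide's query sequence: object, mapping, objects, no joins."""
    user = get_many(db, "users", [user_id])[user_id]
    board_ids = [r[0] for r in db.execute(
        "SELECT board_id FROM user_has_boards WHERE user_id=? ORDER BY rowid",
        (user_id,))]
    boards = get_many(db, "boards", board_ids)
    profile = {"user": user, "boards": []}
    for board_id in board_ids:
        pin_ids = [r[0] for r in db.execute(
            "SELECT pin_id FROM board_has_pins WHERE board_id=? ORDER BY rowid",
            (board_id,))]
        pins = get_many(db, "pins", pin_ids)
        profile["boards"].append(
            {"board": boards[board_id], "pins": [pins[p] for p in pin_ids]})
    return profile


# Made-up sample data.
db = make_shard()
db.execute("INSERT INTO users VALUES (1, ?)", (json.dumps({"name": "example user"}),))
db.executemany("INSERT INTO boards VALUES (?, ?)",
               [(10, json.dumps({"title": "recipes"})),
                (11, json.dumps({"title": "travel"}))])
db.executemany("INSERT INTO pins VALUES (?, ?)",
               [(100, json.dumps({"note": "soup"})),
                (101, json.dumps({"note": "bread"})),
                (102, json.dumps({"note": "lake"}))])
db.executemany("INSERT INTO user_has_boards VALUES (?, ?)", [(1, 10), (1, 11)])
db.executemany("INSERT INTO board_has_pins VALUES (?, ?)",
               [(10, 100), (10, 101), (11, 102)])
profile = load_profile(db, 1)
```

Each step is either a primary-key fetch or a single-column index lookup, which is what makes every one of them easy to put behind a cache.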
  • 52. Scripting
      · Must get old data into your shiny new shard
      · 500M pins, 1.6B follower rows, etc
      · Build a scripting farm
          · Spawn more workers and complete the task faster
          · Pyres, based on GitHub's Resque queue
  • 53. Caching
      · Redis lists to cache mappings
          · lpush user:82363:pins 7233494
          · lrange user:82363:pins 0 49
      · Use Memcache to cache objects
      · Shard caches based upon pools
          · Better stats
          · Easier to scale and manage
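
The two Redis commands on the Caching slide amount to "push the newest ID onto the head of a per-user list, then read the first 50". A minimal sketch of that pattern follows; `FakeRedis` is a hypothetical in-memory stand-in that mimics only `lpush` and `lrange` with non-negative indexes, so the example runs without a server. The helper names are also assumptions.

```python
class FakeRedis:
    """In-memory stand-in for the two list commands used on the slide."""

    def __init__(self):
        self._lists = {}

    def lpush(self, key, *values):
        # Like Redis LPUSH: each value is inserted at the head of the list.
        items = self._lists.setdefault(key, [])
        for value in values:
            items.insert(0, value)
        return len(items)

    def lrange(self, key, start, stop):
        # Like Redis LRANGE: stop is inclusive. Negative indexes are not handled here.
        return self._lists.get(key, [])[start:stop + 1]


def add_pin(cache, user_id, pin_id):
    """Record a new pin at the head of the user's cached mapping."""
    return cache.lpush("user:%d:pins" % user_id, pin_id)


def first_page(cache, user_id, page_size=50):
    """Newest-first page of pin IDs, equivalent to `lrange key 0 49` for 50 items."""
    return cache.lrange("user:%d:pins" % user_id, 0, page_size - 1)


cache = FakeRedis()
for pin_id in range(1, 61):
    add_pin(cache, 82363, pin_id)
page = first_page(cache, 82363)
```

A real client such as redis-py exposes `lpush` and `lrange` methods with the same argument order, though it returns values as bytes or strings rather than the original integers.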
  • 54. Caching with decorators
      def user_get_many(self, user_ids):
          cursor = get_mysql_conn().cursor()
          # One %s placeholder per ID so the driver escapes each value
          placeholders = ", ".join(["%s"] * len(user_ids))
          cursor.execute("SELECT * FROM users WHERE id IN (%s)" % placeholders, user_ids)
          return cursor.fetchall()
  • 55. Caching with decorators
      @mc_objects(USER_MC_CONN,
                  format="user:%d",
                  version=1,
                  serialization=simplejson,
                  expire_time=0)
      def user_get_many(self, user_ids):
          cursor = get_mysql_conn().cursor()
          # One %s placeholder per ID so the driver escapes each value
          placeholders = ", ".join(["%s"] * len(user_ids))
          cursor.execute("SELECT * FROM users WHERE id IN (%s)" % placeholders, user_ids)
          return cursor.fetchall()
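
The deck does not show how `mc_objects` is implemented, so the following is only a guess at the general shape of such a decorator: look up every requested ID in the cache, call the wrapped function for the misses, and store what comes back. Several things are assumptions: the wrapped function returns a `{id: object}` dict rather than raw rows, `DictCache` is a hypothetical in-memory stand-in for a memcache connection, the standard `json` module stands in for `simplejson`, and `expire_time` is accepted but ignored.

```python
import functools
import json


class DictCache:
    """Hypothetical in-memory stand-in for a memcache client."""

    def __init__(self):
        self.data = {}

    def get_multi(self, keys):
        return {k: self.data[k] for k in keys if k in self.data}

    def set_multi(self, mapping):
        self.data.update(mapping)


def mc_objects(conn, format, version=1, serialization=json, expire_time=0):
    """Cache-aside decorator for 'get many objects by ID' methods (sketch)."""

    def decorator(fetch_many):
        @functools.wraps(fetch_many)
        def wrapper(self, ids):
            # The version is part of the key, so bumping it invalidates old entries.
            keys = {i: "v%d:%s" % (version, format % i) for i in ids}
            cached = conn.get_multi(list(keys.values()))
            found = {i: serialization.loads(cached[k])
                     for i, k in keys.items() if k in cached}
            missing = [i for i in ids if i not in found]
            if missing:
                fresh = fetch_many(self, missing)  # assumed to return {id: object}
                conn.set_multi({keys[i]: serialization.dumps(obj)
                                for i, obj in fresh.items()})
                found.update(fresh)
            return [found[i] for i in ids if i in found]
        return wrapper
    return decorator


USER_MC_CONN = DictCache()


class UserStore:
    def __init__(self):
        self.db_calls = []  # records which IDs actually reached the "database"

    @mc_objects(USER_MC_CONN, format="user:%d", version=1)
    def user_get_many(self, user_ids):
        self.db_calls.append(list(user_ids))
        return {uid: {"id": uid, "name": "user%d" % uid} for uid in user_ids}


store = UserStore()
first = store.user_get_many([1, 2])
second = store.user_get_many([2, 3])
```

In this run the second call only reaches the wrapped function for ID 3, because ID 2 was cached by the first call.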
  • 56. Caching with decorators
      def get_board_pins(self, board_id, offset, limit):
          cursor = get_mysql_conn().cursor()
          # MySQL expects LIMIT before OFFSET; values go in as one parameter tuple
          cursor.execute("SELECT pin_id FROM bp WHERE id = %s LIMIT %s OFFSET %s",
                         (board_id, limit, offset))
          return cursor.fetchall()
  • 57. Caching with decorators
      @paged_list(BOARD_REDIS_CONN,
                  format="b:p:%d",
                  version=1,
                  expire_time=24*60*60)
      def get_board_pins(self, board_id, offset, limit):
          cursor = get_mysql_conn().cursor()
          # MySQL expects LIMIT before OFFSET; values go in as one parameter tuple
          cursor.execute("SELECT pin_id FROM bp WHERE id = %s LIMIT %s OFFSET %s",
                         (board_id, limit, offset))
          return cursor.fetchall()
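
As with `mc_objects`, the deck shows only how `paged_list` is applied, not how it works. One plausible reading, sketched below, is that the decorator keeps the head of each list in the cache and serves pages from it, falling back to the wrapped function for pages beyond the cached window. The `max_items` parameter is a hypothetical addition, a plain dict stands in for the Redis connection, and `expire_time` is accepted but ignored.

```python
import functools


def paged_list(conn, format, version=1, expire_time=0, max_items=1000):
    """Serve offset/limit pages from a cached list head (sketch)."""

    def decorator(fetch_page):
        @functools.wraps(fetch_page)
        def wrapper(self, obj_id, offset, limit):
            if offset + limit > max_items:
                # Page reaches past the cached window: go straight to the source.
                return list(fetch_page(self, obj_id, offset, limit))
            key = "v%d:%s" % (version, format % obj_id)
            if key not in conn:
                # Cache miss: load the whole window once, then slice pages from it.
                conn[key] = list(fetch_page(self, obj_id, 0, max_items))
            return conn[key][offset:offset + limit]
        return wrapper
    return decorator


BOARD_REDIS_CONN = {}  # plain dict standing in for a Redis connection


class BoardStore:
    def __init__(self, pins_by_board):
        self.pins_by_board = pins_by_board
        self.db_calls = 0  # counts calls that reached the "database"

    @paged_list(BOARD_REDIS_CONN, format="b:p:%d", version=1, expire_time=24 * 60 * 60)
    def get_board_pins(self, board_id, offset, limit):
        self.db_calls += 1
        return self.pins_by_board.get(board_id, [])[offset:offset + limit]


boards = BoardStore({7: list(range(100))})
page_one = boards.get_board_pins(7, 0, 10)
page_two = boards.get_board_pins(7, 10, 5)
```

Both pages here are served after a single call to the wrapped function. A production version would also need to update or invalidate the cached list when a pin is added, which this sketch does not attempt.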
  • 58. Current problems
      · Service Based Architecture
      · Connection limits
      · Isolation of functionality
      · Isolation of access (security)
      · Scaling the Team
  • 59. Lesson Learned #3: Keep it fun.
  • 60. We are Hiring! jobs@pinterest.com
  • 61. Questions? marty@pinterest.com yashh@pinterest.com